Review products and earn money. Learn more
Earn advertising credits on boddunan. Click here to learn more.
Articles Education Engineering Here Is the Structure Of An Compiler

Here Is the Structure Of An Compiler

 


Compiler



Hello Friends in my this semester i have a subject compiler design and after studying a lot about that subject i am writing a article about compiler which can be helpful for computer science students.A compiler as a single box that maps a source program into a semantically equivalent target program. If we open up this box a little, we see that there are two parts to this mapping: analysis and synthesis. The analysis part breaks up the source program into constituent pieces and imposes a grammatical structure on them.


It then uses this structure to create an intermediate representation of the source program. If the analysis part detects that the source program is either syntactically ill formed or semantically unsound, then it must provide informative messages, so the user can take corrective action.


The analysis part also collects information about the source program and stores it in a data structure called a symbol table, which is passed along with the intermediate representation to the synthesis part. The synthesis part constructs the desired target program from the intermediate representation and the information in the symbol table. The analysis part is often called the front end of the compiler; the synthesis part is the back end. If we examine the compilation process in more detail, we see that it operates as a sequence of phases, each of which transforms one representation of the source program to another. A typical decomposition of a compiler into phases.


In practice, several phases may be grouped together, and the intermediate representations between the grouped phases need not be constructed explicitly. The symbol table, which stores information about th entire source program, is used by all phases of the compiler. Some compilers have a machine-independent optimization phase between the front end and the back end. The purpose of this optimization phase is to perform transformations on the intermediate representation, so that the back end can produce a better target program than it would have otherwise produced from an unoptimized intermediate representation. Since optimization is optional, one or the other of the two optimization phases may be missing.


Lexical Analysis


The first phase of a compiler is called lexical analysis or scanning. The lexical analyzer reads the stream of characters making up the source progra and groups the characters into meaningful sequences called lexemes. For each lexeme, the lexical analyzer produces as output a token of the form (token-name, attribute-value) that it passes on to the subsequent phase, syntax analysis. In the token, the first component token-name is an abstract symbol that is used during syntax analysis, and the second component attribute-value points to an entry in the symbol table for this token. Information from the symbol-table entry 'is needed for semantic analysis and code generation.


For example, suppose a source program contains the assignment statement


position = i n i t i a l + r a t e * 60


The characters in this assignment could be grouped into the following lexemes and mapped into the following tokens passed on to the syntax analyzer:



  • Position is a lexeme that would be mapped into a token (id, I), where  is an abstract symbol standing for identifier and 1 points to the symboltable entry for position. The symbol-table entry for an identifier holds information about the identifier, such as its name and type.





  • The assignment symbol = is a lexeme that is mapped into the token (=). Since this token needs no attribute-value, we have omitted the second component. We could have used any abstract symbol such as assign for the token-name, but for notational convenience we have chosen to use the lexeme itself as the name of the abstract symbol.





  • Initial is a lexeme that is mapped into the token (id, 2), where 2 points to the symbol-table entry for i n i t i a l. + is a lexeme that is mapped into the token (+).Rate is a lexeme that is mapped into the token (id, 3), where 3 points to the symbol-table entry for r a t e .





  • * is a lexeme that is mapped into the token (*) .

  • 60 is a lexeme that is mapped into the token (60) .'


Blanks separating the lexemes would be discarded by the lexical analyzer.


In this representation, the token names =, +, and * are abstract symbols for the assignment, addition, and multiplication operators, respectively. 'Technically speaking, for the lexeme 60 we should make up a token like (number,4), where 4 points to the symbol table for the internal representation of integer 60


Syntax Analysis


The second phase of the compiler is syntax analysis or parsing. The parser uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation that depicts the grammatical structure of the token stream. A typical representation is a syntax tree in which each interior node represents an operation and the children of the node represent the arguments of the operation. A syntax tree for the token stream is shown as the output of the syntactic analyzer


The tree has an interior node labeled * with (id, 3) as its left child and the integer 60 as its right child. The node (id, 3) represents the identifier rate. The node labeled * makes it explicit that we must first multiply the value of r a t e by 60. The node labeled + indicates that we must add the result of this multiplication to the value of i n i t i a l . The root of the tree, labeled =, indicates that we must store the result of this addition into the location for the identifier posit ion. This ordering of operations is consistent with the usual conventions of arithmetic which tell us that multiplication has higher precedence than addition, and hence that the multiplication is to be performed before the addition.


The subsequent phases of the compiler use the grammatical structure to help analyze the source program and generate the target program. we shall use context-free grammars to specify the grammatical structure of programming languages and discuss algorithms for constructing efficient syntax analyzers automatically from certain classes of grammars.


Semantic Analysis


The semantic analyzer uses the syntax tree and the information in the symbol table to check the source program for semantic consistency with the language definition. It also gathers type information and saves it in either the syntax tree or the symbol table, for subsequent use during intermediate-code generation. An important part of semantic analysis is type checking, where the compiler checks that each operator has matching operands.


For example, many programming language definitions require an array index to be an integer; the compiler must report an error if a floating-point number is used to index an array. The language specification may permit some type conversions called coercions. For example, a binary arithmetic operator may be applied to either a pair of integers or to a pair of floating-point numbers. If the operator is applied to a floating-point number and an integer, the compiler may convert or coerce the integer into a floating-point number. Suppose that position, i n i t i a l , and rate have been declared to be floating-point numbers, and that the lexeme 60 by itself forms an integer.


The type checker in the semantic analyzer  discovers that the operator * is applied to a floating-point number r a t e and an integer 60. In this case, the integer may be converted into a floating-point number. Notice that the output of the semantic analyzer has an extra node for the operator int to float, which explicitly converts its integer argument into a floating-point number.


Intermediate Code Generation


In the process of translating a source program into target code, a compiler may construct one or more intermediate representations, which can have a variety of forms. Syntax trees are a form of intermediate representation; they are commonly used during syntax and semantic analysis.


After syntax and semantic analysis of the source program, many compilers generate an explicit low-level or machine-like intermediate representation, which we can think of as a program for an abstract machine. This intermediate representation should have two important properties: it should be easy to produce and it should be easy to translate into the target machine.


we consider an intermediate form called three-address code, which consists of a sequence of assembly-like instructions with three operands per instruction. Each operand can act like a register. The output of the intermediate code generator  consists of the three-address code sequence.


There are several points worth noting about three-address instructions. First, each three-address assignment instruction has at most one operator on the right side. Thus, these instructions fix the order in which operations are to be done; the multiplication precedes the addition in the source program . Second, the compiler must generate a temporary name to hold the value computed by a three-address instruction. Third, some "three-address instructions" like the first and last in the sequence above, have fewer than three operands.  we cover the principal intermediate representations used in compilers.


Code Optimization


The machine-independent code-optimization phase attempts to improve the intermediate code so that better target code will result. Usually better means faster, but other objectives may be desired, such as shorter code, or target code that consumes less power. For example, a straightforward algorithm generates the intermediate code,using an instruction for each operator in the tree representation that comes from the semantic analyzer.


A simple intermediate code generation algorithm followed by code optimization is a reasonable way to generate good target code. The optimizer can deduce that the conversion of 60 from integer to floating point can be done once and for all at compile time, so the int to float operation can be eliminated by replacing the integer 60 by the floating-point number 60.0. Moreover, t3 is used only once to transmit its value to id1 so the optimizer can transform into the shorter sequence


There is a great variation in the amount of code optimization different compilers perform. In those that do the most, the so-called "optimizing compilers," a significant amount of time is spent on this phase. There are simple optimizations that significantly improve the running time of the target program without slowing down compilation too much.


Code Generation


The code generator takes as input an intermediate representation of the source program and maps it into the target language. If the target language is machine code, registers or memory locations are selected for each of the variables used by the program. Then, the intermediate instructions are translated into sequences of machine instructions that perform the same task.


A crucial aspect of code generation is the judicious assignment of registers to hold variables. The first operand of each instruction specifies a destination. The F in each instruction tells us that it deals with floating-point numbers. The code in loads the contents of address id3 into register R2, then multiplies it with floating-point constant 60.0. The # signifies that 60.0 is to be treated as an immediate constant. The third instruction moves id2 into register R1 and the fourth adds to it the value previously computed in register R2. Finally, the value in register R1 is stored into the address of i d l , so the code correctly implements the assignment statement


Symbol-Table Management


An essential function of a compiler is to record the variable names used in the source program and collect information about various attributes of each name. These attributes may provide information about the storage allocated for a name, its type, its scope (where in the program its value may be used), and in the ca,se of procedure names, such things as the number and types of its arguments, the method of passing each argument (for example, by value or by reference), and the type returned. The symbol table is a data structure containing a record for each variable name, with fields for the attributes of the name. The data structure should be designed to allow the compiler to find the record for each name quickly and to store or retrieve data from that record quickly.


The Grouping of Phases into Passes


The discussion of phases deals with the logical organization of a compiler. In an implementation, activities from several phases may be grouped together into a pass that reads an input file and writes an output file. For example, the front-end phases of lexical analysis, syntax analysis, semantic analysis, and intermediate code generation might be grouped together into one pass.


Code optimization might be an optional pass. Then there could be a back-end pass consisting of code generation for a particular target machine. Some compiler collections have been created around carefully designed intermediate representations that allow the front end for a particular language to interface with the back end for a certain target machine. With these collections, we can produce compilers for different source languages for one target machine by combining different front ends with the back end for that target machine. Similarly, we can produce compilers for different target machines, by combining a front end with back ends for different target machines.


Compiler-Construction Tools


The compiler writer, like any software developer, can profitably use modern software development environments containing tools such as language editors, debuggers, version managers, profilers, test harnesses, and so on. In addition to these general software-development tools, other more specialized tools have been created to help implement various phases of a compiler. These tools use specialized languages for specifying and implementing specific components, and many use quite sophisticated algorithms. The most successful tools are those that hide the details of the generation algorithm and produce components that can be easily integrated into the remainder of the compiler. Some commonly used compiler-construction tools include


 


1. Parser generators that automatically produce syntax analyzers from a grammatical description of a programming language.


2. Scanner generators that produce lexical analyzers from a regular-expression description of the tokens of a language.


3. Syntax-directed translation engines that produce collections of routines for walking a parse tree and generating intermediate code.


4. Code-generator generators that produce a code generator from a collection of rules for translating each operation of the intermediate language into the machine language for a target machine.


5. Data-flow analysis engines that facilitate the gathering of information about how values are transmitted from one part of a program to each other part. Data-flow analysis is a key part of code optimization.


6. Compiler-construction toolk2ts that provide an integrated set of routines for constructing various phases of a compiler




User reviews

There are no user reviews for this listing.

To write a review please register or log in.
 
 
 
Author Articles
More from this author:
Something About >> How To Traverse XML (181 Hits)
Programming > Others
  How To Traverse XML   To begin this article, we take a look at how to make the transformations dangled in front of you throughout the last...
Transforming XML (172 Hits)
Programming > Others
Transforming XML If you are a backend systems developer or a systems architect, you should be seeing the value ofXML by now. A language that...
About->>Motherboard (551 Hits)
Computers & Technology > Hardware & Troubleshooting
Motherboard: The motherboard is, without a doubt, the primary component of the entire system. Without the support circuitry and functions this...
Something About>> Recordable Drives (140 Hits)
Computers & Technology > Hardware & Troubleshooting
Recordable Drives Optical devices such as CD –ROMs and DVD –ROMs are good data storage solutions. They hold a lot of data. But the biggest...
Something About >> How To Traverse XML(Part2) (200 Hits)
Computers & Technology > New Technologies
The Document Object Model (DOM) The Document Object Model, unlike SAX, has its origins in the World Wide Web Consortium (W3C). Whereas SAX...
INTRODUCTION TO INTERNET (295 Hits)
Computers & Technology > New Technologies
INTRODUCTION TO INTERNET Internet is a worldwide network that is a widely used to connect universities, government offices, companies and private...
Get Knowledge About WEB BROWSER (290 Hits)
Computers & Technology > New Technologies
WEB BROWSERS Most browsers have a point and click interface-the browser displays information on the computer’s screen and permit a user to...
Related Articles
Related Articles:
Compiler & Phases of Compiler (4444 Hits)
vijay_u
Miscellaneous > General Reference
Before we know about the phases of compiler,we must know waht is compiler. Compiler : Compiler is a simple program  which reads a...
Compiler Construction Tools (1095 Hits)
vijay_u
Education > Engineering
There are many compiler construction tools available to enhance the compiler performance.Some of them 1.Parser Generators: These generators...
Understanding how compiler was developed (316 Hits)
vinay
Programming > C & C++
Hello everyone, This is my first article on this site however there are lots of computer related articles which I have written for other resources....
DATA STRUCTURE PART-1 (260 Hits)
JiraiyaRaj
Education > Computer Science
    DATA STRUCTURE PART-1 In this part-1 of data structure we are going to see some of the types of data structures and their definitions....
Information on Pesticide and its Etymology, History, Structure and Consumption (207 Hits)
Xerox1rocks
Miscellaneous > General Reference
Pesticide A pesticide is a substance released into a culture to fight against organisms harmful. It is a generic term which includes the...
Structure of DNA: Primary and Secondary structure, Tertiary structure, Double helix structure (2985 Hits)
Xerox1rocks
Miscellaneous > General Reference
Structure of DNA DNA is a molecule duplex, ie consists of two chains arranged in antiparallel manner and with the nitrogenous bases facing each...
Cell: Structure and expression, Energy Conversion and Cytoskeleton (241 Hits)
Xerox1rocks
Miscellaneous > General Reference
Cell: Structure and expression The DNA and various levels of packaging. Eukaryotic cells have their genetic material, usually a single...
Latest Articles
Latest Articles:
Transport System In India (6 Hits)
sajeetharan
People & Places > Social Life
It would be nice to discuss the public transport facilities that we have in our cities.Let me share How much we are satisfied and how we we are...
Device driver (4 Hits)
Binief
Computers & Technology > New Technologies
A device driver represents a computer programme that admits higher-level PC programs to communicate with a machine. It checks the interractions...
Writing an article effectively (3 Hits)
Binief
Miscellaneous > General Reference
Do you would like to know how you are able to compose an article in under thirty minutes? Whenever you would like to be more creative in the...
Be proud of our 'Sardars' ... Must read! (5 Hits)
sunaina badhwar
Entertainment > Jokes & Humour
We all love those santa-banta jokes.But do you know that sikhs are one of the most hard-working and prosperous communities of the world. After...
Asus J 502 (48 Hits)
sunnyrsingh
Reviews > Mobile Phones
The Asus J 502 is a Slider-Touchscreen phone comes with 24MB internal memory and memory can be expanded by microSD card. It has 3 megapixel camera...
Education System In India (19 Hits)
sajeetharan
Education > Careers
Indian Education system is going through progess period and will emerge to be great.Its not easy to spread literacy in a country with population...
Asus Z801 (45 Hits)
sunnyrsingh
Reviews > Mobile Phones
The Asus Z801 is a Clamshell-Touchscreen phone comes with 64MB internal memory and memory can be expanded by microSD card. It has 2 megapixel camera...
Most Read Articles
Most Read Articles:
Money making programs on boddunan.com (10180 Hits)
bnadmin
Boddunan.com is not only a platform to share knowledge and learn from the articles posted by other members but also provides an oportunity to make...
CANDICE BERGEN (9745 Hits)
xaskoiking
People & Places > Celebrities
Hai friends in this article let us discuss about the 5 time Emmy award winner and 2 time Gold Globe award winning American Actress and model CANDICE...
Understanding the Anatomy of Abdominal Muscles (8278 Hits)
soni2006
Education > Medicine & Surgery
To do any exercise correctly you need to know which muscles are involved and the movements they perform. Anatomically locating these...
History of Transportation – Your Complete Guide (8227 Hits)
soni2006
People & Places > Creators
Definition of transportation Transportation is a means of moving people or goods from one place to another. The modern commercial...
Origami (8152 Hits)
ktkps
Miscellaneous > General Reference
We all would at some point in our childhood (and in college) folded papers to make rocket and other things.But we do not know the name for that...
TOP 10 BICEPS EXERCISES PART 1 (7922 Hits)
sapu
Health & Fitness > Yoga & Excercises
Hi friends, well this article is exclusively for those men and may be, who knows even some girls who want to shape up their biceps by simple...
Jallianwala Bagh Massacre (7336 Hits)
xaskoiking
Miscellaneous > General Reference
#) More than 379 dead and 1500 plus were injured..... These are the number of victims of the cruel massacre by  a cruel man called dyer #)Shock...

Trackback(0)

TrackBack URI for this entry

Comments (4)

Subscribe to this comment's feed
...
id_crisis
nice info.....keep posting.....
id_crisis , March 24, 2010
...
kavishanigam
you present this very nicely keep it up
kavishanigam , July 19, 2010
...
kavishanigam
nice article and very helpful
kavishanigam , July 19, 2010
...
kavishanigam
Good work .. Its useful. keep it up
kavishanigam , July 23, 2010

Write comment

smaller | bigger
security image
Write the displayed characters

busy

Sponsored Links

Leader Board

This Week
sajeetharan (2698 Points)
neetu020784 (1708 Points)
sunnyrsingh (892 Points)
jayen (830 Points)
bolisettypradeep (800 Points)
kdeepti (800 Points)
kumaresh (744 Points)
jasmeetsingh (650 Points)
ra_k (597 Points)
kalyani (585 Points)
Last
aasthagupta36 (1932 Points)
neetu020784 (1661 Points)
vijuagrawal (1596 Points)
sunnyrsingh (1263 Points)
kdeepti (1120 Points)
jobin (888 Points)
kalyani (836 Points)
ra_k (634 Points)
kumaresh (516 Points)
kavishanigam (428 Points)
Updated every 1 hour(s)

You are not logged in.

Boddunan Stats

  • 6819 registered
  • 8 today
  • 39 this week
  • 549 this month
  • Last: emad981

Subscribe Newsletter

Server Time