virtualgate's picture

SDT, Runtime Env, Itermediate Code Generation

Introduction to Syntax directed translation

Definition

Syntax Directed Translation are augmented rules to the grammar that facilitate semantic analysis. SDT involves passing information bottom-up and/or top-down the parse tree in form of attributes attached to the nodes. Syntax directed translation rules use

1) lexical values of nodes,

2) constants , 

3) attributes associated to the non-terminals in their definitions.

 

The general approach to Syntax-Directed Translation is to construct a parse tree or syntax tree and compute the values of attributes at the nodes of the tree by visiting them in some order. In many cases, translation can be done during parsing without building an explicit tree.

 

Example:

E -> E+T | T

T -> T*F | F

F -> INTLIT

 

This is a grammar to syntactically validate an expression having additions and multiplications in it. Now, to carry out semantic analysis we will augment SDT rules to this grammar, in order to pass some information up the parse tree and check for semantic errors, if any. In this example we will focus on evaluation of the given expression, as we don’t have any semantic assertions to check in this very basic

example.

E -> E+T { E.val = E.val + T.val } PR#1

E -> T { E.val = T.val } PR#2

T -> T*F { T.val = T.val * F.val } PR#3

T -> F { T.val = F.val } PR#4

F -> INTLIT { F.val = INTLIT.lexval } PR#5

 

For understanding translation rules further, we take the first SDT augmented to [ E -> E+T ] production rule. The translation rule in consideration has val as attribute for both the non-terminals – E & T. Right hand side of the translation rule corresponds to attribute values of right side nodes of the production rule and vice-versa. Generalizing, SDT are augmented rules to a CFG that associate

1) set of attributes to every node of the grammar and

2) set of translation rules to every production rule using attributes, constants and lexical values.

 

Let’s take a string to see how semantic analysis happens – S = 2+3*4. Parse tree corresponding to S would be

 

To evaluate translation rules, we can employ one depth first search traversal on the parse tree. This is possible only because SDT rules don’t impose any specific order on evaluation until children attributes are computed before parents for a grammar having all synthesized attributes. Otherwise, we would have to figure out the best suited plan to traverse through the parse tree and evaluate all the attributes in one or more traversals. For better understanding, we will move bottom up in left to right fashion for computing translation rules of our example.

 

 

 

Above diagram shows how semantic analysis could happen. The flow of information happens bottom-up and all the children attributes are computed before parents, as discussed above. Right hand side nodes are sometimes annotated with subscript 1 to distinguish between children and parent.
Additional Information
Synthesized Attributes are such attributes that depend only on the attribute values of children nodes.
Thus [ E -> E+T { E.val = E.val + T.val } ] has a synthesized attribute val corresponding to node E. If all the semantic attributes in an augmented grammar are synthesized, one depth first search traversal in any order is sufficient for semantic analysis phase.

Inherited Attributes are such attributes that depend on parent and/or siblings attributes.
Thus [ Ep -> E+T { Ep.val = E.val + T.val, T.val = Ep.val } ], where E & Ep are same production symbols annotated to differentiate between parent and child, has an inherited attribute val corresponding to node T.

Contributor's Info

Created:
0Comment
Syntax directed translation examples
Content covered: 

Syntax directed translation examples

More Less
0Comment
Examples of SDT
Content covered: 

Examples of SDT

More Less
0Comment
S attributed and L attributed definitions
Content covered: 

S attributed and L attributed definitions

More Less
0Comment
Introduction to intermediate code generation

Intermediate code generator receives input from its predecessor phase, semantic analyzer, in the form of an annotated syntax tree. That syntax tree then can be converted into a linear representation, e.g., postfix notation. Intermediate code tends to be machine independent code.

1)If a compiler translates the source language to its target machine language without having the option for generating intermediate code, then for each new machine, a full native compiler is required.

2)Intermediate code eliminates the need of a new full compiler for every unique machine by keeping the analysis portion same for all the compilers.

3)The second part of compiler, synthesis, is changed according to the target machine.

4)It becomes easier to apply the source code modifications to improve code performance by applying code optimization techniques on the intermediate code. intermediate code can be either language specific (e.g., Byte Code for Java) or language independent (three-address code).

 

Three-Address Code

Intermediate code generator receives input from its predecessor phase, semantic analyzer, in the form of an annotated syntax tree. That syntax tree then can be converted into a linear representation, e.g., postfix notation. Intermediate code tends to be machine independent code. Therefore, code generator assumes to have unlimited number of memory storage (register) to generate code.

For example:

a = b + c * d;

The intermediate code generator will try to divide this expression into sub-expressions and then generate the corresponding code.

r1 = c * d;

r2 = b + r1;

a = r2

r being used as registers in the target program.

A three-address code has at most three address locations to calculate the expression. A three-address code can be represented in three forms : quadruples , triples, indirect triple.

Quadruples

 

Each instruction in quadruples presentation is divided into four fields: operator, arg1, arg2, and result. The above example is represented below in quadruples format:

Op arg1 arg2 result
* c d r1
+ b r1 r2
+ r2 r1 r3
= r3   a

 

 

Triples

Each instruction in triples presentation has three fields : op, arg1, and arg2.The results of respective sub-expressions are denoted by the position of expression. Triples represent similarity with DAG and syntax tree. They are equivalent to DAG while representing expressions.

Op arg1 arg2
* c d
+ b (0)
+ (1) (0)
= (2)  

 

Triples face the problem of code immovability while optimization, as the results are positional and changing the order or position of an expression may cause problems.

 

 

Indirect Triples

This representation is an enhancement over triples representation. It uses pointers instead of position to store results. This enables the optimizers to freely re-position the sub-expression to produce an optimized code.

disadvantages of using indirect triples is we have to access two memory due to pointer.

 

for more details one can refer to below link:

https://www.slideshare.net/rawan_z/intermediate-code-generation

Contributor's Info

Created:
0Comment
Types of 3 address code and example

Here we have 3 address code which have below types as:

1) x=y Op z

2) x=Op z

3) x=y

4) if x (relational Op) y goto L

5) goto L

6) A[i] = x

7) y=A[i]

8) x=*P

8) y=&x

 

 

Example:

-(a+b)*(c+d)+(a+b+c)

given instruction , you should change it into 3 address code 

sol:-

t1=a+b

t2=-t1

t3=c+d

t4=t2*t3

t5=a+b

t6=t5+c

t7=t4+t6

 

Contributor's Info

Created:
0Comment
A Vision to Run time Environments

Run time environments means how a program should be loaded in the operating system and what is the support that your program needs.

                  By runtime, we mean a program in execution. Runtime environment is a state of the target machine, which may include software libraries, environment variables, etc., to provide services to the processes running in the system.

               Runtime support system is a package, mostly generated with the executable program itself and facilitates the process communication between the process and the runtime environment. It takes care of memory allocation and de-allocation while the program is being executed.

 

Activation Trees

A program is a sequence of instructions combined into a number of procedures. Instructions in a procedure are executed sequentially. A procedure has a start and an end delimiter and everything inside it is called the body of the procedure. The procedure identifier and the sequence of finite instructions inside it make up the body of the procedure. The execution of a procedure is called its activation.

                  We assume that the program control flows in a sequential manner and when a procedure is called, its control is transferred to the called procedure. When a called procedure is executed, it returns the control back to the caller. This type of control flow makes it easier to represent a series of activations in the form of a tree, known as the activation tree.

 

Storage Allocation

Runtime environment manages runtime memory requirements for the following entities:

 

Code : It is known as the text part of a program that does not change at runtime. Its memory requirements are known at the compile time.

Procedures : Their text part is static but they are called in a random manner. That is why, stack storage is used to manage procedure calls and activations.

Variables : Variables are known at the runtime only, unless they are global or constant. Heap memory allocation scheme is used for managing allocation and de-allocation of memory for variables in runtime

 

1) Static:

  • Allocation is done at compile time.
  • Bindings do not change.
  • one activation record per procedure.
  • array by default is static.
  • Activations (function call) can have permanent lifetime in case of static allocation.

Disadvantages:

  • Recursion is not supported.
  • sizeof data object must be be known at compile time.
  • data structure can't be created dynamically.

 

2) Stack

  • whenever a new activation begins activation record is pushed on to the stack and whenever activation ends ,activation record is popped off.
  • local variable are bound to fresh storage.
  • Activations  can have nested lifetime in case of stack allocation.

 

Disadvantages:

  • local variable can't be retained once activation end.

i.e f()

         { int count=0;

          count++;

 

          }

 

3) Heap:

  • Allocation and deallocation canbe in any order.
  • In c programming language to management the hole or free space in heap we are using First fit.
  • Activations can have Arbitrary  lifetime in case of heap allocation.

Disadvantage:

Heap management is overhead.

 

Contributor's Info

Created:
0Comment

  • This quiz contains 5 questions on the topic Syntax Directed Tree
  • Lean well before you attempt the quiz
  • You can attempt the quiz unlimited number of times.

Difficulty Level:  intermediate

  • This quiz contains 5 questions on the topic Run Time Environment
  • Lean well before you attempt the quiz
  • You can attempt the quiz unlimited number of times.

Difficulty Level:  intermediate

  • This quiz contains 5 questions on the topic Static Single Assignment and Intermediate Code Generation
  • Lean well before you attempt the quiz
  • You can attempt the quiz unlimited number of times.

Difficulty Level:  intermediate