Code Optimizer | Software Engineering

Code optimizer:

This phase is also called as optional phase. If the intermediate representation is good enough to develop the object code, then the compiler will skip the process of optimizing the intermediate representation.

Else, this phase will optimize the intermediate representation. Optimization can be complex; it may involve numerous sub phases, which may need to be applied more than once. Optimizations may be turned off to speed up the  translation.

Contributor's Info

Created:
0Comment
Code Optimization

                                                                    CONTENTS

 

List of topics that we cover in this article:

  • Introduction
  • What is code optimization?
  • What are the issues to be considered while applying the optimization?
  • Factors that code become inefficient
  • Where exactly code optimization can be applied?
  • Code optimization
    • Machine Dependent
    • Machine Independent
  • Scope of Operation
    • Local Scope
    • Global Scope
  • Local optimization
    • Common subexpression Elimination
    •  Copy propagation
    • Dead code Elimination
    • Constant folding
  • Loop optimization
    • Loop invariant
    • Induction variables
    • Reduction in strength
    • Loop unrolling
    • Loop jamming
  • Basic blocks
  • Partial redundancy removal
  • Conclusion.

 

CODE OPTIMIZATION

Introduction:

What is code optimization?

  • Code optimization is a technique required to produce efficient code and it makes program to consume less memory and delivers high speed this optimization technique will be applied whenever it is needed.
  • Good thing that code optimization will do is reduces the time complexity of a program.

Issues to be considered while applying the optimization

There are mainly two issues:

  • Meaning of the source code should not be changed.

 

  • The efficiency of the source code must be gained without changing the algorithm. 

 

The code become inefficient due to two factors:

  1. Programmer
  2. Compiler

Machine Dependent or (Target machine in specific):

  • To produce target code, the target machines they use instruction set and addressing modes.
  • We perform machine dependent optimization once the target code gets generated, then it is transformed to target machine architecture.
  • Registers can involve and have memory references instead of relative references.
  • Machine dependent takes more advantage of the memory hierarchy.
  • Benefits will be heavily depending up on machine.
  • Perform optimization on critical parts of code.
  • By performing optimization, it will execute more quickly on the machine.
  • For example, consider the PDP-11 family machines they have autoincrement and autodecrement features for instructions.
  • These features will be used instead of using a constant value 1.
  • We can add or subtract this constant from the memory that is stored.
  • We can also say this as an optional activity in compilation.
  • This completes the compilation process.

Machine Independent (programming language):

 

  • Based on the characteristics of the programming language for applicable programming structure and usage ofarithmetic properties to reduce the execution time.
  • Mainly it optimizes the intermediate code so that the better target program can be generated.
  • For example:

X=Y*0 is the same as X=0

These optimization techniques are applied on source code based on scope of operation.

Local scope:

 

When a local optimization must be performed its scope will be limited to certain specific block of statement.

Global scope:

When a global optimization must be performed its scope is throughout the program or within some procedures. It will be performed by using data flow analysis.

  1. copy propagation:
  • It means use of one variable instead of another.
  • For instance, x = y is a statement which is called a copy statement then copy propagation is a kind transformation in which use y for x wherever possible after the copy statement x = y
  • Although this is not an improvement, but it will help in eliminating assignment to ‘n’.

  • Dead code elimination:
  • A variable is said to be live in the program if its value can be used subsequently otherwise it is treated as dead at that point.

This code will never get executed because it never satisfies i= =1.

 

Therefore, we call it as dead code or useless code.

 

Here ifstatement is a dead code as this never gets satisfied it can be optimized as i=0

 

  1. Constant folding:
  • It is a technique in which the constant expressions will be calculated during compilation.
  • It folds all the constants and will give an equivalent value.

  • Loop optimization is a technique in which we will perform optimization on inner loops where the program will spend large amount of time. By this execution of the program will increase the speed and improves the cache performance, consume less memory by doing loop optimization.
  1. Loop invariant:
  • Loop invariant is a technique which moves the code outside of the loop.

  1. Induction variables:
  • A variable x is called as induction variable of loop ‘L’ if the value of variable gets changed every time it will be either incremented or decremented by some constant.

 

Contributor's Info

Created:
0Comment
Three Address Codes | Analysis phase | Synthesis phase

INTRODUCTION:

The code which we write will be in high level language. It should be converted into low level machine understandable language. The role of conversion from high level to low level is done by the compiler.

Compilation is done in 2 phases.

  • Analysis phase
  • Synthesis phase

Analysis phase consists of 6 three phases.

  • Lexical analyzer
  • Syntax analyzer
  • Semantic analyzer

Synthesis phase consists of three phases.

  • Intermediate code generator
  • Code optimizer
  • Code generator

Lexical analyzer:

 In this phase the initial processing of the source code takes place. Each and every term is divided into a individual sequence of characters called as tokens. Every meaning full token is called as a lexeme.

Syntax analyzer:

It is the second phase in the analysis stage. It is also called as parser. In this expressions and statements are identified according to the output of the lexical analyzer.

Semantic analyzer:

Syntax analyzer produces syntax tree as a output. It does follows the structure but it doesn’t bother about the meaning of the individual. This role is carried out by the semantic analyzer.

Intermediate code optimizer:

In this the compiler will develop the intermediate code and later that is used for the development of the target code. It increases the efficiency of converting the source code into the target code.

Code optimizer:

This is the optional phase. If the intermediate code is not an optimized code then this phase will operates on the output if the intermediate code and turns it into an efficient code.

Code generator:

This is the final stage in the compiler. Code generation is a process of creating the assembly language or machine understandable language.

 

THREE ADDRESS CODE

What is three address code?

It is one which allows at most two operators which includes assignment operator by default (:=) and at most three operands.

General notation is Z : = X op Y

Where operator can be + or or * or /.

Three address code is one of the intermediate code representation in the phase of intermediate code representation. It is one of the low level representations in the intermediate code representation.

Types of three address code

  1. Assignment statement: x= z op y
  2. Copy statement: x= y
  3. Condition jump: if x<2 goto M
  4. Unconditional jump: goto M
  5. Procedure call: call p
  6. Indexed assignemt: x= y[i]
  7. Pointer assignment: x=*y

Representation of intermediate code

Representation of intermediate code is done in two different ways.

  1. High level representation.
  • Abstract syntax tree
  • Direct acyclic graphs
  • Flow graph

 

  1. Low level representation.
  • Postfix notation
  • Three address code

Three address code is again represented in three ways.

  • Quadruple
  • Triples
  • Indirect triples

Quadruple: It is structure which consists of four fields.

Op→ operator

Arg1→ operand1

Arg2→ operand2

Result→ used to store the temporary results of the expressions.

SNO

OP

ARG1

ARG2

RESULT

         

 Advantages: Results can be stored temporarily so that each expression results can be viewed.

Disadvantages: requires more space because of the storage of temporary variables

Triples: In this the usage of temporary variables is removed by referring to the pointers to the symbol table. It consists of three fields rather than four.

Op→ operator

Arg1→ operand1

Arg2→ operand2

SNO

OP

ARG1

ARG2

       

Advantages: compared with the quadruple notion it requires less space.

Disadvantages: the result  cannot be moved to the desired location.

Indirect triples: In this the pointers are matched with another set which are randomly assigned and are aliased with the original ones. It also consists of three fields.

 

Op→ operator

Arg1→ operand1

Arg2→ operand2

SNO

OP

ARG1

ARG2

       

 

SNO

ALIASING POINTER

   

 

Example for each notation of three addresses code representation:

Given expression x:= -a*b+-a*b

we have to convert the above expression in to three address code by following the operator precedence for accurate results.

Operator precedence (/, *, +, -).

Assign temporary variable for each of the expression say (t).

Step1: Take the expression given above and split into parts.

We have to take –a first in order to multiply with b.

So,

t1= -a

t2=t1*b

t3=-a

t4=t3*b

t5=t4+t2

x=t5 (assigning result to x)

Quadruple representation:

For each insertion (t) observe the operator and insert into the operator field operands into the Arg1 and Arg2 fields.

For example

  1. t1=-a;

           ( -)→ operator

           a → arg1

  1. t2= t1*b

(*) → operator

t1 → arg1

t2 → arg2

By proceeding like that we can easily represent the three address code in the quadruple notation.

SNO

OP

ARG1

ARG2

RESULT

1

-

a

 

t1

2

*

t1

b

t2

3

-

a

 

t3

4

*

t3

b

t4

5

+

t4

t2

t5

6

=

t5

 

x

 

Triple representation:

In this there is no need of the result field.

Assigning is same for all notations

x:= -a*b+-a*b

t1= -a

t2=t1*b

t3=-a

t4=t3*b

t5=t4+t2

x=t5 (assigning result to x)

.Instead of mapping with the operands directly we can map with the pointers in this triple notation.

Pointers {1,2,3,4,5,6}

For example: in 2  Arg1 will be –a and Arg2 will be b

But in place of arg1 we will use 1 instead of –a

 Like that for statement we will map with pointers instead of the operands.

SNO

OP

ARG1

ARG2

1

-

a

 

2

*

1

b

3

-

a

 

4

*

3

b

5

+

2

4

6

=

5

 

 

Indirect triple notation:

x:= -a*b+-a*b

t1= -a

t2=t1*b

t3=-a

t4=t3*b

t5=t4+t2

x=t5 (assigning result to x)

In this notation the pointers are aliased with another set of random set of pointers. The remaining process is same as like the triple notation.

For example

SNO

ALIASING POINTER

1

10

2

11

3

12

4

13

5

14

6

15

 

In this we will use aliasing pointers instead of original pointers.

In 2 we will use 10 instead of 1

In 4 we will use 12 instead of 3

In 5 we will use 11 and 13 instead of 2 and 4

In 5 we will use 14 instead of 5

SNO

OP

ARG1

ARG2

1

-

a

 

2

*

10

b

3

-

a

 

4

*

12

b

5

+

11

13

6

=

14

 

THREE ADDRESS CODE FOR PROGRAM STATEMENTS:

  1. simple if

if(a<b) then x=y+z

Three address code:

  1. if(a<b) goto i+2     (true case)

i+1)     goto i+4                (false case)

i+2)     t1=y+z                  

i+3)     x=t1

i+4)     goto

  1. while loop

while(a<b)

{

X=y+z

}

            For checking the conditions we have to use the simple if condition.

            Three address code:

  1. if(a<b) goto i+2     (true case)

i+1)     goto i+4                (false case)

i+2)     t1=y+z                  

i+3)     x=t1

i+4)     goto(i)                   (loop iterates upto to conditions gets false)

 

  1. Do while

Do

{
x-y+z

}

While(a<b)

Three address code:

  1. t1=y+z             (true case)

i+1)     x=t1                (false case)

i+2)     if(a<b) goto (i)                 

i+3)     goto                 (false case)

 

  1. switch case

switch(i+j)

{
case 1: x= y+z

Default: p=q+r

Case 2: u=z+w

}

Three address code: 

  1. t1=i+j

i+1)     t2=y+z                  

i+2)     x=t2

i+3)     t3=q+r

i+4)     p=t3

i+5)     goto

i+6)     t4=z+w

i+7)     u=t4

i+8)     if t1=1, goto(i+2)

i+9)     if t2=2, goto(i+7)

i+10)   goto(i+4)

  1. for

for(i=1;i<=20;i++)

{

X=y+z;

}

Three address code:

  1.       i=1

j+1)     if i<=20 goto(j+6)

j+2)     goto (i)                 

j+3)     t1= i+1

j+4)     i=t1

j+5)     goto(j+1)

j+6)     t2=y+z

j+7)     x=t2

j+8)     goto(j+3)

 

  1. 1 Dimensional array:

a [i]

Three address code:

  1. t1= i * width of array 

i+1)     t2= address (a)-width of array

i+2)     t3= t2[t1]

 

sample programs:

1) switch(a+b)

{
case 2: {x=y; break}

Case 5:{switch x

{
case 0: { a=b+1; break;}

Case 1:{a=b+3;break;}

Default : {a=2;break;}

}

Break;

Case 9: {x=y-1; break;}

Default:{a=2; break;}

}

Three address code:

Steps for designing the three address code for the above program ( nested switch)

We have to assign the internal switch condition (a+ b) to another variable.

Assign all the condition statements present inside each case and assign the result to the temporary variable.

Later check for each condition using the if condition and place goto block for each statement respectively.

 

i)         t1=a+b

i+1)     Goto()

i+2)     t2=y

i+3)     x=t2

i+4)     t3=x

i+5)     t4=b+1

i+6)     a=t4

i+7)     t5=b+3

i+8)     a=t5

i+9)     a=2

i+10)   t6=y-1

i+11)   x=t6

i+12)   Goto(i+9)

i+13)   if  t1=2; goto(i+3)

i+14)   if  t2=5; goto(i+4)

i+15)   if  x=0; goto(i+6)

i+16)   if x=1; goto(i+8)

i+17)   Goto(i+9)

i+18)   if t1=9; goto(i+11)

i+19)   goto(i+9)

Contributor's Info

Created:
0Comment