Symbol Table

Symbol Table:

A compiler uses a symbol table to keep track of scope and binding information about names. The symbol table is searched every time any name is encountered in the source text. Changes to the table occur if any new name or new information about an existing name is discovered.

Definition: Symbol Table is an important data structure created and maintained by the compiler in order to keep track of semantics of variable i.e. it stores information about the scope and binding information about names, information about instances of various entities such as variable and function names, classes, objects, etc.

  • The information is going to be collected by analysis phase and used by syntheses phase.
  • A symbol table mechanism must allow us to add new entries and find the existing entries efficiently.
  • It is useful for a compiler to be able to grow the symbol table dynamically, if necessary at compile time. If the size of the symbol table is fixed when the compiler is written, then the size chosen must be large enough to handle any source program that might be presented.

What should be in a symbol table?

  •  variable names
  •  procedure and function names
  • literal constants and strings
  •  source text labels

What information might compiler need?

  • textual name
  • data type
  • dimension information (for aggregates )
  • declaring procedure
  •  lexical level of declaration
  • storage class (base address )
  • offset in storage
  • if the record, a pointer to structure table
  • if parameter, by-reference or by-value?
  • can it be aliased? to what other names?
  • number and type of arguments to functions.

  Various Implementation techniques of symbol table:-

1. Unordered list

  •  Use an array or linked list.
  •  O(n) time search.
  •  O(1) time insertion.
  •  Simple & compact, but too slow in practice.

2. Ordered list

  •   O(log n) time with a binary search. 
  •   O(n) time insertion.
  •   Simple, good if the tables are known in advance. 

3. Binary search tree

  •   O(log n) time expected to search.
  •   O(n) worst-case.
  •   O(log n) time expected insertion.
  •   a less simple approach.

4.Hash table

  •   O(1) time expected for search.
  •   O(1) time expected for insertion.
  •   worst case is very unlikely. 
  •   subtle design issues, more programmer effort.
  •   this approach is taken in most compilers.