Image

PA4 - Decaf Analysis

Goal

The goal of our semester-long project is to gain experience in compiler implementation by constructing a simple compiler. We will be implementing a compiler for a variant of the well-known Decaf language, which is a simple Java-like procedural programming language.

The goal of this part of the project is to implement static analysis routines for your Decaf compiler that will generate symbol tables and perform type checking.

Part 1 - Symbol Tables

For this part of the assignment, you will generate symbol tables for the AST produced by your front-end from the first two projects. You will need to re-download the AST specification file, which now has definitions for types and symbol tables. There is also a new version of the main driver.

DecafAST.py

DecafCompiler.py

Here is a template starter for this part of the project:

DecafAnalysis.py

A symbol table is a registry of bound symbols, representing fields, methods, or variables. Symbol tables could store many kinds of information, but for our purposes we only need to store type information. Every ASTProgram, ASTMethodDecl, and ASTBlock should have a table containing symbols declared at that level and their associated type.

There are three kinds of types: basic types (int, boolean), array types, and function types. Basic types are self-explanatory. Array types contain the base type and the size of the array. Function types contain the types of the function's input as well as its return (output) type.

For ASTPrograms, the symbol table should hold types for all fields and methods declared in that program. For ASTMethodDecls, the table should hold types for all parameters to that method. For ASTBlocks, the table should hold types for all variables declared in that block. Note that you will also need to add the index variable from for-loops to the symbol table for their child block.

Note that each table represents a static scope and should store a pointer to the table of its parent scope. For instance, a symbol table corresponding to a method declaration should contain a reference to the table of the parent program.

You should write a function build_symbol_tables that takes an AST object and a parent symbol table (which should be None for the initial call). This function should build a symbol table for each ASTProgram, ASTMethodDecl, and ASTBlock, according to the description above. The symbol table should be saved in the corresponding AST object, using the new symtable member varaible.

The build_symbol_tables function should call itself recursively for each child object that needs a symbol table.

You do not need to write any code to output the symbol table. I have provided that in the updated DecafAST.py file. Printing the AST will now also print any included symbol tables. As long as your code builds the symbol table correctly, it should be included in the standard output. See below for an example.

Part 2 - Type Checking

Once your symbol tables are being generated properly, you should use them to perform type checking, which is the process of verifying that all actual types match their expected types. Here is a list of things to check (note that this list is not exhaustive):

  • Program contains a main method that takes no parameters and has a void return type
  • No duplicate field or method names
  • No duplicate variables within a single scope
  • All field, local method, and variable references are valid in their scope
  • Left-hand side and right-hand side of assignments must have the same type
  • Only ints are used in math expressions
  • Only bools are used in conditional expressions
  • Only bools are used for if and while conditionals
  • Only arrays are used in location dereferences
  • Only ints are used as array indices
  • Array sizes must be greater than zero.
  • Method call names match defined methods (ignore callouts for now)
  • Actual method parameters match the method's formal parameters
  • break and continue statements only appear inside while or for loops
  • Range expressions for for loops must be ints

Write a function type_check that returns True if no type errors were found, and False if there were errors. Each error should be printed to standard output, one per line. There is no standard format for error messages, but they should include a short description of the problem, along with the expected and actual type information if applicable.

Sample Input

    // gcd.decaf

    class Program {
        int t;

        // uses euclid's algorithm
        int gcd(int a, int b) {
            while (b > 0) {
                t = b;
                b = a % b;
                a = t;
            }
            return a;
        }

        // program entry point
        void main() {
            t = gcd(24, 196);
            callout("printf", "%d\n", t);
        }
    }

Sample Output

    - Program
      SYMTABLE: { 't':<int>, 'gcd':<(<int>, <int>) -> <int>>, 'main':<() -> <void>> }
      - FieldDecl name=t type=int size=1
      - MethodDecl name=gcd type=int params={a:int, b:int}
        SYMTABLE: { 'a':<int>, 'b':<int> }
        - Block
          SYMTABLE:{  }
          - WhileLoop cond=(b>0)
            - Block
              SYMTABLE:{  }
              - Assignment loc=t expr=b
              - Assignment loc=b expr=(a%b)
              - Assignment loc=a expr=t
          - Return
            - Expr [loc]
              - Location name=a
      - MethodDecl name=main type=void params={}
        SYMTABLE: {  }
        - Block
          SYMTABLE: {  }
          - Assignment loc=t expr=gcd(24, 196)
          - MethodCall name="printf" [callout]
            - Literal type=string value="%d\n"
            - Expr [loc]
              - Location name=t

Other Inputs and Outputs

You can find more examples of input and output on the class server in the following folders:

/public/cs630/decaf/inputs

/public/cs630/decaf/outputs-analysis

Submission

PART 1 DUE: Friday, Mar 20, 2015, at 23:59 (11:59PM) EST

PART 2 DUE: TBD

Recommended file names (for Python):

    Decaf.g4                  (grammar, previous project)

    DecafLexer.py             (auto-generated using ANTLR4)
    DecafParser.py                         "
    DecafListener.py                       "

    DecafCompiler.py          (driver, provided above)
    DecafAST.py               (AST framework, provided above)
    DecafASTConverter.py      (converter, previous project)
    DecafAnalysis.py          (analysis routines)

For each part, submit a zip or tarball with your complete solution on Canvas. Your code must compile and work on the class server. Part of your grade will be based on automated testing and part will be based on style considerations. Please make sure that your project adheres to the specification exactly to avoid difficulties with the automated testing framework, and make sure that your code is clean and documented.