PA4 - Decaf Analysis
Goal
The goal of our semester-long project is to gain experience in compiler implementation by constructing a simple compiler. We will be implementing a compiler for a variant of the well-known Decaf language, which is a simple Java-like procedural programming language.
The goal of this part of the project is to implement static analysis routines for your Decaf compiler that will generate symbol tables and perform type checking.
Part 1 - Symbol Tables
For this part of the assignment, you will generate symbol tables for the AST produced by your front-end from the first two projects. You will need to re-download the AST specification file, which now has definitions for types and symbol tables. There is also a new version of the main driver.
Here is a template starter for this part of the project:
A symbol table is a registry of bound symbols, representing fields, methods, or variables. Symbol tables could store many kinds of information, but for our purposes we only need to store type information. Every ASTProgram, ASTMethodDecl, and ASTBlock should have a table containing symbols declared at that level and their associated type.
There are three kinds of types: basic types (int, boolean), array types, and function types. Basic types are self-explanatory. Array types contain the base type and the size of the array. Function types contain the types of the function's input as well as its return (output) type.
For ASTPrograms, the symbol table should hold types for all fields and methods declared in that program. For ASTMethodDecls, the table should hold types for all parameters to that method. For ASTBlocks, the table should hold types for all variables declared in that block. Note that you will also need to add the index variable from for-loops to the symbol table for their child block.
Note that each table represents a static scope and should store a pointer to the table of its parent scope. For instance, a symbol table corresponding to a method declaration should contain a reference to the table of the parent program.
You should write a function build_symbol_tables
that takes an
AST object and a parent symbol table (which should be None
for the
initial call). This function should build a symbol table for each
ASTProgram
, ASTMethodDecl
, and ASTBlock
,
according to the description above. The symbol table should be saved in the
corresponding AST object, using the new symtable
member varaible.
The build_symbol_tables
function should call itself recursively
for each child object that needs a symbol table.
You do not need to write any code to output the symbol table. I have
provided that in the updated DecafAST.py
file. Printing the AST
will now also print any included symbol tables. As long as your code builds the
symbol table correctly, it should be included in the standard output. See below
for an example.
Part 2 - Type Checking
Once your symbol tables are being generated properly, you should use them to perform type checking, which is the process of verifying that all actual types match their expected types. Here is a list of things to check (note that this list is not exhaustive):
- Program contains a
main
method that takes no parameters and has avoid
return type - No duplicate field or method names
- No duplicate variables within a single scope
- All field, local method, and variable references are valid in their scope
- Left-hand side and right-hand side of assignments must have the same type
- Only ints are used in math expressions
- Only bools are used in conditional expressions
- Only bools are used for
if
andwhile
conditionals - Only arrays are used in location dereferences
- Only ints are used as array indices
- Array sizes must be greater than zero.
- Method call names match defined methods (ignore callouts for now)
- Actual method parameters match the method's formal parameters
break
andcontinue
statements only appear insidewhile
orfor
loops- Range expressions for
for
loops must be ints
Write a function type_check
that returns True
if
no type errors were found, and False
if there were errors. Each
error should be printed to standard output, one per line. There is no standard
format for error messages, but they should include a short description of the
problem, along with the expected and actual type information if applicable.
Sample Input
// gcd.decaf class Program { int t; // uses euclid's algorithm int gcd(int a, int b) { while (b > 0) { t = b; b = a % b; a = t; } return a; } // program entry point void main() { t = gcd(24, 196); callout("printf", "%d\n", t); } }
Sample Output
- Program SYMTABLE: { 't':<int>, 'gcd':<(<int>, <int>) -> <int>>, 'main':<() -> <void>> } - FieldDecl name=t type=int size=1 - MethodDecl name=gcd type=int params={a:int, b:int} SYMTABLE: { 'a':<int>, 'b':<int> } - Block SYMTABLE:{ } - WhileLoop cond=(b>0) - Block SYMTABLE:{ } - Assignment loc=t expr=b - Assignment loc=b expr=(a%b) - Assignment loc=a expr=t - Return - Expr [loc] - Location name=a - MethodDecl name=main type=void params={} SYMTABLE: { } - Block SYMTABLE: { } - Assignment loc=t expr=gcd(24, 196) - MethodCall name="printf" [callout] - Literal type=string value="%d\n" - Expr [loc] - Location name=t
Other Inputs and Outputs
You can find more examples of input and output on the class server in the following folders:
/public/cs630/decaf/inputs
/public/cs630/decaf/outputs-analysis
Submission
PART 1 DUE: Friday, Mar 20, 2015, at 23:59 (11:59PM) EST
PART 2 DUE: TBD
Recommended file names (for Python):
Decaf.g4 (grammar, previous project) DecafLexer.py (auto-generated using ANTLR4) DecafParser.py " DecafListener.py " DecafCompiler.py (driver, provided above) DecafAST.py (AST framework, provided above) DecafASTConverter.py (converter, previous project) DecafAnalysis.py (analysis routines)
For each part, submit a zip or tarball with your complete solution on Canvas. Your code must compile and work on the class server. Part of your grade will be based on automated testing and part will be based on style considerations. Please make sure that your project adheres to the specification exactly to avoid difficulties with the automated testing framework, and make sure that your code is clean and documented.