Image

PA5 - Decaf Code Generation

Goal

The goal of our semester-long project is to gain experience in compiler implementation by constructing a simple compiler. We will be implementing a compiler for a variant of the well-known Decaf language, which is a simple Java-like procedural programming language.

The goal of this part of the project is to build a three-address code (TAC) generator for Decaf ASTs.

Description

As before, you will need to re-download the AST specification file, which now has definitions for three-address code constructs. There is also a new version of the main driver.

DecafAST.py

DecafCompiler.py

Here is a template starter for this part of the project:

DecafCodeGen.py

Three-address code (TAC) is a convenient format for representing machine-independent instructions. Here are the supported operations for our version of TAC:

    TACInsn
        dest = op src1              (unary op)
        dest = src1 op src2         (binary op)

    TACCopy
        dest = src                  (copy)
        dest[idx] = src             (indexed copy)
        dest = src[idx]             (   "     "  )

    TACLabel
        lbl:                        (label)

    TACJump
        goto lbl                    (unconditional jump)
        if var goto lbl             (conditional jump)

    TACParam
        param var                   (pass variable parameter)
        param lit                   (pass literal parameter)

    TACCall
        call lbl                    (call function)
        dest = call lbl             (call function w/ return value)

    TACRet
        ret                         (function return)
        ret var                     (function return w/ value)

The updated version of DecafAST.py provides definitions for a range of TAC objects. The forms above are grouped by their corresponding TAC object from the file. Read this code and its documentation carefully and verify that you understand each class's purpose.

Most of the TAC classes are merely structures for holding instruction data; however, the TACProgram class does a bit more. It represents a sequence of TAC instructions and labels (providing list-like access with the len function and indexing using bracket notation), but it also provides several utility methods that you will find helpful while you complete this project:

  • gen(insn)
    Appends an instruction onto the list of instructions.
  • newAnonLabel()
    Generates a new "anonymous" label; i.e., a label without a user-specified name. This is useful for creating jump targets for Decaf code structures like conditionals and loops. Returns a TACLabel with a name that includes a tag and a monotonically-increasing number.
  • newTempVar()
    Generates a new temporary variable; i.e., a variable without a user-defined name. This is crucial for a variety of code generation applications. Returns a TACVar with a name that includes a tag and a monotonically-increasing number.
  • pushLoop(continueLabel, breakLabel)
    popLoop()
    You can use the ASTProgram to track continue/break labels while you are in a loop. Simply call the pushLoop function with the labels when you are ready to recursively emit the body of the loop, then call popLoop when you are done.
  • currentContinueLabel()
    currentBreakLabel()
    If you have used pushLoop and popLoop properly, you can use these functions to retrieve the correct labels when generating code for continue or break statements.

Another important thing to note is the difference between TACVar and TACIdxVar. The former represents a scalar variable, while the latter represents an indexed (array) variable. For simplicity, indexed variables may only be used in simple copy instructions (TACCopy), and only one may appear per instruction.

You should write a function generate_tac that takes an AST object (with pre-generated symbol tables). This function should generate TAC for all of the methods in the program, adding all of the instructions in order to a TACProgram object and returning it.

As outlined in the template file, I recommend creating two recursive functions gen_tac and gen_tac_val. The former should handle levels of the AST that do not require returning results in a temporary variable (e.g., ASTMethodDecl, ASTBlock, ASTAssignment, ASTConditional, ASTWhileLoop, etc.), while the latter should handle levels that do (e.g., ASTExpr, ASTLocation).

Sample Input

    // gcd.decaf

    class Program {
        int t;

        // uses euclid's algorithm
        int gcd(int a, int b) {
            while (b > 0) {
                t = b;
                b = a % b;
                a = t;
            }
            return a;
        }

        // program entry point
        void main() {
            t = gcd(24, 196);
            callout("printf", "%d\n", t);
        }
    }

Sample Output

    gcd:
    @l0:
        't0 = b > 0
        't1 = ! 't0
        if 't1 goto @l1
        t = b
        't2 = a % b
        b = 't2
        a = t
    @l1:
        ret a
    main:
        param 24
        param 196
        't3 = call gcd
        t = 't3
        param "%d\n"
        param t
        call "printf"
        ret

Other Inputs and Outputs

You can find more examples of input and output on the class server in the following folders:

/public/cs630/decaf/inputs

/public/cs630/decaf/outputs-codegen

Submission

DUE: Friday, Apr 3, 2015, at 23:59 (11:59PM) EST

Recommended file names (for Python):

    Decaf.g4                  (grammar, previous project)

    DecafLexer.py             (auto-generated using ANTLR4)
    DecafParser.py                         "
    DecafListener.py                       "

    DecafCompiler.py          (driver, provided above)
    DecafAST.py               (AST framework, provided above)
    DecafASTConverter.py      (converter, previous project)
    DecafAnalysis.py          (analysis routines, previous project)
    DecafCodeGen.py           (TAC generation routines)

Submit a zip or tarball with your complete solution on Canvas. Your code must compile and work on the class server. Part of your grade will be based on automated testing and part will be based on style considerations. Please make sure that your project adheres to the specification exactly to avoid difficulties with the automated testing framework, and make sure that your code is clean and documented.