2015-02-12
Parsing (Ch. 4)

bottom-up parsing
  - builds a parse tree, starting at the leaves (Fig. 4.25)
  - simulates reverse rightmost derivation

shift-reduce parsers
  - implements bottom-up parsing efficiently
  - "shift" tokens, non-terminals, or states onto a stack
  - "reduce" stack by popping and pushing stack elements according to a
      grammar production
  - very efficient
  - can parse a proper superset of grammars vs. LL parsing
  - very tedious to write by hand
  - key questions: 1) when to reduce
                   2) which production to apply
  - actions: shift, reduce, accept, error

"handle" - body of a production on the stack (can be reduced)
  - ex. Fig 4.26 and Fig 4.28

conflicts
  - shift/reduce: usually resolve in favor of shifting (Ex. 4.38)
  - reduce/reduce: requires grammar modification (Ex. 4.39)

SLR: simple LR
  - augment the grammar with S' -> S
  - build item sets using CLOSURE operation
    - "kernel" items: dots in the middle (or right end)
    - "nonkernel" items: dots on the left end
  - construct LR(0) automaton
    - item sets represent states
      - one for each dot location in each production
      - remember to take the closure!
    - GOTO function defines transitions (i.e., moving the dot)
    - simulates an NFA that can read terminals and non-terminals
  - use LR(0) automaton for parsing
    - use a stack for tracking progress
    - create ACTION function/table for shift-reduce lookups
    - Alg. 4.44 / Fig. 4.36 / Fig. 4.37

Example: Fig. 4.31, Fig. 4.34, and Fig. 4.36
Example: Fig. 4.38
  - partial equivalence of Fig. 4.31 and Fig. 4.37 (no reductions in automaton)

Example:

   S -> a S b   (1)
      | a b     (2)

   I_0:  S' -> . S
      ~  S  -> . a S b
      ~  S  -> . a b

   I_1:  S' -> S .

   I_2:  S  -> a . S b
         S  -> a . b
      ~  S  -> . a S b
      ~  S  -> . a b

   I_3:  S  -> a S . b

   I_4:  S  -> a S b .

   I_5:  S  -> a b .

   FOLLOW(S) = { b, $ }

        a     b     $      S
     ------------------------
   0   s2                  1
   1               acc
   2   s2    s5            3
   3         s4
   4         r1    r1
   5         r2    r2

viable prefixes
  - prefixes of right sentential forms that can appear on the stack
  - must always be able to add terminals to a viable form to obtain a
    sentential form

SLR and LALR generate similar tables, but LALR accepts more grammars
Canonical LR generate much larger tables
GLR allows ambiguities and represents all possible valid parses
  SLR : GLR :: DFA : NFA

test cases for Decaf project
submission activated for PA2
  - lower point value than PA1 or subsequent PAs
  - more concerned that you are ready to move forward with full compiler

ANTLR is actually an adaptive LL(*) parser generator