2015-02-12 Parsing (Ch. 4) bottom-up parsing - builds a parse tree, starting at the leaves (Fig. 4.25) - simulates reverse rightmost derivation shift-reduce parsers - implements bottom-up parsing efficiently - "shift" tokens, non-terminals, or states onto a stack - "reduce" stack by popping and pushing stack elements according to a grammar production - very efficient - can parse a proper superset of grammars vs. LL parsing - very tedious to write by hand - key questions: 1) when to reduce 2) which production to apply - actions: shift, reduce, accept, error "handle" - body of a production on the stack (can be reduced) - ex. Fig 4.26 and Fig 4.28 conflicts - shift/reduce: usually resolve in favor of shifting (Ex. 4.38) - reduce/reduce: requires grammar modification (Ex. 4.39) SLR: simple LR - augment the grammar with S' -> S - build item sets using CLOSURE operation - "kernel" items: dots in the middle (or right end) - "nonkernel" items: dots on the left end - construct LR(0) automaton - item sets represent states - one for each dot location in each production - remember to take the closure! - GOTO function defines transitions (i.e., moving the dot) - simulates an NFA that can read terminals and non-terminals - use LR(0) automaton for parsing - use a stack for tracking progress - create ACTION function/table for shift-reduce lookups - Alg. 4.44 / Fig. 4.36 / Fig. 4.37 Example: Fig. 4.31, Fig. 4.34, and Fig. 4.36 Example: Fig. 4.38 - partial equivalence of Fig. 4.31 and Fig. 4.37 (no reductions in automaton) Example: S -> a S b (1) | a b (2) I_0: S' -> . S ~ S -> . a S b ~ S -> . a b I_1: S' -> S . I_2: S -> a . S b S -> a . b ~ S -> . a S b ~ S -> . a b I_3: S -> a S . b I_4: S -> a S b . I_5: S -> a b . FOLLOW(S) = { b, $ } a b $ S ------------------------ 0 s2 1 1 acc 2 s2 s5 3 3 s4 4 r1 r1 5 r2 r2 viable prefixes - prefixes of right sentential forms that can appear on the stack - must always be able to add terminals to a viable form to obtain a sentential form SLR and LALR generate similar tables, but LALR accepts more grammars Canonical LR generate much larger tables GLR allows ambiguities and represents all possible valid parses SLR : GLR :: DFA : NFA test cases for Decaf project submission activated for PA2 - lower point value than PA1 or subsequent PAs - more concerned that you are ready to move forward with full compiler ANTLR is actually an adaptive LL(*) parser generator