2015-04-02 Overview of Optimization (Ch. 9.1) Data Flow Analysis (Ch. 9.2-9.5) optimization scopes - local (within basic block) - global/regional (between basic blocks, within functions) - interprocedural (between functions) local optimizations - local value numbering (LVN) (EAC 8.4.1, 6.1.2) - tree-height balancing (EAC 8.4.2, 8.10.1) regional/global optimizations - global common subexpression elimination (9.1.4) - copy propagation (9.1.5) - dead code elimination (9.1.6) - code motion (9.1.7) - strength reduction (9.1.8) - superlocal value numbering (SVN) (EAC 8.5.1) - loop unrolling (EAC 8.5.2, 10.4.5) - finding uninitialized variables (EAC 8.6.1) (error checking) interprocedural optimizations - inline substitution (EAC 8.7.1) - interprocedural constant propagation data-flow analysis - based on control-flow graphs (CFGs) - computes functions over CFG nodes - basic concept: collapse potentially-infinite code execution paths - need to be conservative - sound analysis: all errors are reported (no false negatives, but false positives are possible) - complete analysis: all reported errors are real (no false positives, but false negatives are possible) - solving constraints: transfer functions - IN(b) and OUT(b) for each block b - often a composition of IN(s) and OUT(s) for each statement s - use union ("meet") operator to collapse multiple flow paths - usually make use of other functions (gen, kill, def, use, etc.) - PRED(b) = { all predecessors of block b } - SUCC(b) = { all successors of block b } - forward vs. backward - forward: IN(b) = f_pred(OUT(p)) OUT(b) = f(IN(b)) - backward: OUT(b) = f_succ(IN(s)) IN(b) = f(OUT(b)) - simple algorithm: iterate until convergence - forward: Alg. 9.11 and Fig. 9.14, general case in Fig. 9.23(a) - backward: Alg. 9.14 and Fig. 9.16, general case in Fig. 9.23(b) - in practice usually requires <5 iterations - represent sets by bit vectors - all operations are logical integer operations - smarter algorithm: reverse postorder (RPO) - visit as many of a node's predecessors as possible before itself dominators (9.6.1, EAC 9.2.1) - a node d "dominates" another node n if every path from the entry to n goes through d (every node dominates itself) - "proper" dominator (excludes nodes from dominating themselves) - example from EAC OUT(entry) = { } OUT(others) = { all blocks } IN(b) = ∩_{p ∈ PRED(b)} OUT(p) OUT(b) = {b} ∪ IN(b) live variables (9.2.5, EAC 9.2.2) - a variable is "live" if it could be used at some future point DEF(b) = { all variables assigned a value in b prior to their use } USE(b) = { all variables used in b prior to their definition } IN(exit) = {} IN(others) = {} OUT(b) = ∪_{s ∈ SUCC(b)} IN(s) IN(b) = USE(b) ∪ (OUT(b) - DEF(b)) reaching definitions (9.2.4, EAC 9.2.4) - a definition is "downwards exposed" from a basic block if it is visible immediately after the block - a definition is "killed" by a subsequent definition to the same variable GEN(b) = { all definitions that are downwards exposed in b } KILL(b) = { all definitions potentially killed by b } OUT(entry) = {} OUT(others) = {} IN(b) = ∪_{p ∈ PRED(b)} OUT(p) OUT(b) = GEN(b) ∪ (IN(b) - KILL(b)) available expressions (9.2.6, EAC 9.2.4) - an expression is "invalidated" when any variables in the expression are re-assigned - requires special initialization for OUT(b) in the iterative algorithm EGEN(b) = { expressions calculated and not subsequently invalidated in b } EKILL(h) = { expressions invalidated in b } OUT(entry) = {} OUT(others) = { all expressions } IN(b) = ∩_{p ∈ PRED(b)} OUT(p) OUT(b) = EGEN(b) ∪ (IN(b) - EKILL(b)) summary table (Fig. 9.21) semilattice theory - meet operator (∧) forms a partial ordering (greatest lower bounds) - special element ⊤ (top) and ⊥ (bottom) - ∀x. ⊤ ∧ x = x - ∀x. ⊥ ∧ x = ⊥ - example in Fig 9.22 (∧ = ∪) generalization: - domain - direction (forwards or backwards) - transfer function - boundary conditions - meet operator (∩ or ∪) - initialization for iterative algorithm FORWARDS BACKWARDS UNION reaching definitions live variables used expressions INTERSECTION dominators anticipated expressions available expressions postponable expressions constant propagation (9.4, EAC 9.2.6) - data-flow with carefully-constructed semilattice - sparse simple constant propagation (SSCP) partial-redundancy elimination (9.5, EAC 10.3.1) - four separate data-flow passes - lazy code motion (LCM) static single assignment (SSA) form (6.2.4, EAC 9.3)