Lab 5: Optimizations

Lab Overview

In this lab, you'll add optimizations into your Tiger compiler, to make it generate better target code. There are 3 parts in this lab: in part A, you will write a bunch of optimizations on the abstract syntax tree: dead class eliminations, constant folding, algebraic simplifications, and dead code elimination, and so on. In part B, you will write a new intermediate representation: control flow graph (CFG), and you will write translator from C to CFG, and you will port your code generator and garbage collector to CFG. In part C of the lab, you will write several classic data-flow analysis on CFG: liveness analysis, reaching definitions, available expressions, and so on; and you will write several optimizations: constant propagation, copy propagation, dead code elimination, and common sub-expression elimination, etc..

Getting Started

First check out the source we offered you for Lab5:

  $ git commit -am 'my solution to lab4'
  $ git checkout -b Lab5 origin/Lab5

these commands will first commit your changes to the Lab4 branch of your local Git repository, and then create a local Lab5 branch and check out the remote Lab5 branch into the new local Lab5 branch.

Again, you will need to merge your code from Lab4 into the new Lab5 branch:

  $ git merge Lab4

Don't forget to resolve any conflicts before committing to the local Lab5 branch:

  $ git commit -am 'lab5 init'

You should first import the new Lab5 code into your Eclipse editor, and make sure the code compiles. There are a bunch of new files that you should browse through:

  ast.optimizations.*         various optimization on the AST
  cfg.*                       control-flow graph on optimizations on it

Hand-in Procedure

When you finished your lab, zip you code and submit to the school's information system.

Part A: Optimizations on the AST

Optimizations are program rewritings with the goal to make the optimized code faster, smaller, and so on. Optimizations can take different flavours: from imperative style to functional style and others. The optimizations you'll design and implement in this lab will make use of a functional style, in which a new program will be built (as the result of the optimization) without modifying the old program (the program being optimized). It'll be easier to write, debug, and maintain your code to use this style. Don't worry about what a functional style is, if you have not heard of it before. After this lab, you will get a deep understanding of the functional programming style and how to program in it (using OO).

Optimizations are highly dependent on the specific intermediate representation (IR) used: an optimization may be easy to perform on some kinds of IRs but difficult, if not impossible, on other kinds of IRs. Strictly speaking, as a high-level intermediate representation, abstract syntax trees are not a good IR for the purpose of optimizations, because the control/data flow information are obscure on an AST. As a result, only a few optimization are eligible for ASTs. However, it's not a bad idea for you to write optimizations on the AST as a starting point: first, it will give you some general ideas about how the functional-style optimization strategy work throughout this lab; second, the optimizations you will write on the ASTs, on some cases, will simplify the AST dramatically and thus make later phases of the compiler simpler.

Compiler Passes, Timing and Tracing

Your Tiger compiler has become much complex for now. And because you will add many optimizations in this lab, so it's very important to control the behaviors of your Tiger compiler and all these optimizations. For instance, optimization order are very important because some optimizations are effective only under a specific order. And furthermore, it's also very important to obtain and analyze the execution time of each optimization, so that you can control the tradeoff of optimizations (for example, you may want to drop or skip some optimizations, if it spends too much time).

For this, we have offered you, in lab 5 code, more command line arguments to control your Tiger compiler more handily. You can find the new arguments by running:

  $ java Tiger -

especially, you should pay attention to the -skip, -trace and -verbose arguments. The first one will skip some optimizations, and the second one will trace the execution of some optimizations, and finally the last one will show the compilation passes in detail.

As an example, read the code in the file ast.optimizations.Main.java, there is the top-level code for the 4 different optimizations you will write on ASTs: dead class elimination, dead code elimination, algebraic simplifications, and constant folding. For now, you don't need to read other files in the directory ast.optimizations, you only need to understand the relative order of these optimizations and how to control them.

Exercise 1. Be sure you can answer all the following questions:

what's the relative order of these optimizations? You can run
```
  $ java Tiger -testFac -verbose 1
```
to justify your answer. Or, you can run
```
  $ java Tiger -testFac -verbose 2
```
to check the running time of each optimization pass (among others).
is it a good idea to change the order of these optimizations? for instance, exchange dead class elimination and constant folding. Explain your conclusion.
suppose you observed that the constant folding pass is spending too much time, how can you drop this pass by executing some specific command line arguments? How can you skip all these 4 optimizations?

Dead Class Elimination

Any nontrivial Java project may contain many Java classes, however, not all classes are used in that project. So the primary goal of a dead class elimination optimization is to analyze which class will never be in the project and thus can be considered dead. By eliminating all the dead classes from the project, you can shrink the program size considerably. For instance, consider the test case test/DeadClass.java, the class Fac is a dead class and thus can be safely eliminated.

Exercise 2. Design an algorithm to perform dead class elimination. Analyze the computation complexity of your algorithm. Be careful of pitfalls, don't eliminate live classes.

You can use a work-list algorithm to perform dead classes elimination in a given Java program. First, introduce a work-list to hold all live classes. You can read the code in the package ast.optimization.DeadClass.java for the work-list algorithm. Initially, push onto the work-list the Main class which contains the static Java main method, because this class will never be dead. Everytime, if you pop a class A from the work-list and analyze the code of class A, push all classes used in A onto the work-list, repeat this process until the work-list becomes empty.

Exercise 3. Read the code in the package ast.optimization.DeadClass.java for the algorithm and code for dead class elimination. Fill in missing code in that file. Especially, you will use a functional style in writing your optimization algorithms, the key feature of this style is you will build new syntax trees from the old ones without modifying or destroying the old ones.

Don't forget to do regression testing. If you are not sure whether or not your optimization works properly, you can trace your optimization using the -trace argument, as in:

  $ java Tiger -testFac -trace ast.DeadClass

Dead Code Elimination

Besides dead classes, there may be other forms of dead code in a given Java program. For instance, consider the test case test/DeadCode.java, there is some dead code in it (which?). So the purpose of a dead code elimination is to remove code that can be statically proved to be dead.

Exercise 4. Design an algorithm to perform dead class elimination. Analyze the computation complexity of your algorithm. And then implement your algorithm in the file ast.optimizations.DeadCode.java.

Again, do regression testing before continuing.

Algebraic Simplification

The purpose of an algebraic simplification is to simplify program phrases, according to some algebraic rules. For instance, in the test case test/AlgSimp.java, you can find one such instance: the code x*0 can be simplified to 0.

Exercise 5. Design an algorithm to perform algebraic simplifications. Analyze the computation complexity of your algorithm. And then implement your algorithm in the file ast.optimizations.AlgSimp.java.

Remember to do regression testing before continuing.

Constant Folding

The basic idea of constant folding is to compute the values of an expression statically at compile time. Consider the test case test/AlgSimp.java, after algebraic simplification, the if condition expression becomes 0<2, obviously, it can be further simplified to true.

Exercise 6. Design an algorithm to perform constant folding. Analyze the computation complexity of your algorithm. And then implement your algorithm in the file ast.optimizations.ConstFold.java.

To this point, you have finished all these 4 different optimizations on ASTs. However, as you may have noticed that one optimization will make other ones possible. For instance, after the constant folding optimization, the if condition may become a constant, so it's a good opportunity for you to do another round of dead class elimination, dead code elimination, and so on.

Exercise 7. The code base in the file ast.optimizations.Main.java only performs one round of optimizations. Try to modify this code to optimize an AST as deep as possible. Think carefully of your design and write elegant code.

Part B: Control-flow Graph (CFG)

A control-flow graph (CFG) is a graph-based program representation with basic blocks as nodes and control transfers between blocks as edges. A control-flow graph is a good candidate intermediate representation for it makes optimizations easier to perform. In this part of the lab, you will construct the control-flow graph representation of a given program, and write various optimizations on CFGs.

Control-flow Graph (CFG)

You will first familiarize yourself with syntax trees for the control-flow graph. We have offered you most of the code for the control-flow graph in the package cfg.*. And you should read all the code first. Basically, the data structures for this CFG is very much like the C AST from lab 3: virtual method tables, structures (classes), main method, other (virtual) methods, and so on.

Exercise 8. Make sure you read the following code:

cfg.program.*
cfg.vtable.*
cfg.classs.*
cfg.mainMethod.*
cfg.method.*

You may refer to the corresponding code in the package codegen.C.*.

However, there is some key difference between the CFG and the C AST: in the C AST, a method body contains a sequence of statements; but in the CFG, a method body contains a sequence of basic blocks (you can see this from the method definition code in cfg.method.Method.java).

Basic Blocks (BBs)

So what are basic blocks? A basic block is a sequence of code that can only enter from the first instruction and exit from the last instruction. So, intuitively, you can only enter a basic block from the first instruction and executes each instruction contained therein in turn, until reach the last instruction; you can NOT enter or exit from the middle of a basic block.

Exercise 9. You can find the data structures defining a basic block in the code cfg.block.Block.java. A basic block contains three components: the block's name (a label), the containing statements, and a control transfer. Make sure you understand this code before continuing.

Three-address Code (TAC)

The statements in a CFG (code cfg.stm.*) are different from those in a C AST (code codegen.C.stm.*). The representation for the CFG statements makes use of three address code (TAC), a widely used compiler intermediate representation in many production compilers.

The name three-address code reflects one important property of this intermediate representation: in general, most statements contain at most three operands (addresses) (although variants of TACs may contain more or less addresses). For instance, consider the add statement defined in cfg.stm.Add.java, the syntax for this statement should be dst = left + right, in which both left and right are operands, and dst is the destination of this assignment to store the addition result. Furthermore, you can see that the types for both left and right are cfg.operand.T, which is either a variable or an integer literal. So if we treat an integer literal as a general form of address, there are totally 3 addresses. Besides the three addresses left, right, dst, there is also one operator +, so, for this reason, TAC is also called a 4-tuple in some literatures.

In the package cfg.stm, there are now a bunch of different definitions for statements: addition, virtual method invocation, less than, value movement, object allocation, printing, subtraction, and multiplication. Again, these statement are only enough to compile the test/Fac.java test case. In the remaining part of this lab, you will add more statements to compile the full MiniJava.

Exercise 10. Read all the code in the package cfg.stm and all code in the sub-packages in it. Make sure you understand all the code before continuing.

Now, you can compile the test case test/Fac.java:

  $ java -cp bin Tiger -testFac -dump C

this will produce a control-flow graph in a C file a.c, along with a C code from lab 3 part A in file a.c.c. Note that in the control-flow graph, there is the following line at the 4th line:

  // Control-flow Graph

Skim these files and compare them briefly.

Graph Visualization

It will be nice to visualize a control-flow graph, so it will be easier to analyze the graph. Fortunately, the Tiger compiler comes with this nice feature. Nevertheless to say, to draw a figure prettily is nontrivial and may lead to much programming effort, so your Tiger compiler will make use of the Graphviz software to visualize a control-flow graph. Graphviz is a very popular software for graph visualization (both in compilers and other fields).

Exercise 11. Download graphviz from its official download site, and install it on your machine. Don't forget to add it to your PATH.

To make sure you've installed graphviz correctly, you can run its command on your prompt:

  $ dot -?

will will print something like:

Usage: dot [-Vv?] [-(GNE)name=val] [-(KTlso)] 
(additional options for neato)    [-x] [-n]
(additional options for fdp)      [-L(gO)] [-L(nUCT)]
(additional options for memtest)  [-m]
(additional options for config)  [-cv]
...

Also, you can take a look at its manual, if you are interested in.

The Tiger compilers has rudimentary support for CFG visualization, that is, it can draw a graph for each method (function) in a program.

Exercise 12. To visualize a CFG for each method, run your Tiger compiler using the -visualize option. For instance, for the test/Fac.java test case, you can compile it with:

  $ java Tiger -testFac -visualize jpg

This command will draw two figures in jpeg format. In the jpeg file Fac_ComputeFac.jpg, you can see four blocks connected by directed edges. However, in the blocks, there are strings like:

Your code here:

you should add some code to substitute this line with real code in this block (you can refer to exercise 10 for the real code). Hint: the code you will modify resides in the file cfg.block.Block.java.

Don't forget to test your Tiger compiler. When in doubt, don't forget to use the -verbose option to control your Tiger compiler. And you will continue to modify this part of the code in later part of the lab.

CFG Generation

In principal, you can generate CFG from the Java AST directly, however, you will "reinvent the wheel" by doing so, because many hard work has been finished in part A of lab 3. So a better idea is to write the code generator starting from the C code from part A of lab 3.

You can find this code generator in the visitor cfg.TranslateVisitor.java, which traverses a C AST and generates a CFG AST.

Exercise 13. Finish this code generator by filling in code for all methods. For this, you may need to add more classes to package cfg.stm, and to modify the pretty printer cfg.PrettyPrinterVisitor.java, to modify the graph visualizing code, etc.. You should pay special attention to the cookBlocks() method in the file cfg.TranslateVisitor.java, which you may need to modify.

To this point, your Tiger compiler can compile all legal MiniJava programs to CFG-based intermediate representations and generate C code for them. Then generated C code can be further linked with runtime to produce final executables. Do regression test on your Tiger compiler. For now, don't worry about porting your Gimple garbage collector to this CFG, for you will do this in the next part of the lab.

Part C: CFG-based Optimizations

CFG-based program representations make program optimizations much easier, due to the fact that it's much easier to calculate the control flow and data flow information required to perform optimizations. Generally speaking, an optimization can be divided into two phases: program analysis and rewriting. During the program analysis phase (usually data-flow analysis), the compiler analyzes the target program to get useful static information as an approximation of the program's dynamic behaviour. And such information will be used in the second phase: program rewriting, the compiler rewrites the program according to the static information obtained in the first phase. As a result fo these two steps, the program is altered in some way---it is optimized. For instance, in the dead-code elimination optimization, the program analysis phase will determine whether or not the variable x defined by x = e will be used anywhere in this program; if x will not be used, the program rewriting phase will remove this assignment from this program (and the program shrinks).

In this part of the lab, you will write several classical program analysis: liveness analysis, reaching definitions, available expressions, and so on. And you will also write several classical optimizations (based on these analysis): constant propagation, copy propagation, dead code elimination, etc..

Liveness Analysis

You will write liveness analysis in this part of the lab. The liveness analysis you'll write calculate, for each statement and each transfer in a block, the live in and live out variable sets.

You will write your liveness analysis algorithm (in fact, most data-flow analysis algorithms) on a block granularity, so you should take a look at the general steps in such an algorithm:

data_flow()
  calculate the gen and kill sets for each statement and transfer;
  calculate the gen and kill sets for each basic block;
  calculate the in and out sets for each basic block;   // fix-point
  calculate the in and out sets for each statement and transfer.

For specific data-flow you will write next, you will need to modify this algorithm template to finish your work. Different data-flow algorithms will take minor differences. For instance, for the liveness analysis in this part of the lab, you should calculate live_in and live_out for the in and out separately; and the fix-point code should calculate the in and out sets for each basic block in a reverse topo-sort order. But for other kind of data-flow analysis, all these details may be different.

Exercise 14. Finish the liveness analysis code in the file cfg.optimizations.LivenessVisitor.java. We have offered you the code for the first step: calculation of the gen and kill sets for each statement and transfer in all blocks. Your job is to finish the other 3 steps. Don't forget to test your code immediately when you finish each step. For instance, if you want to test the first step on the test case Fac.java, you can run:

  $ java -cp bin Tiger -testFac -trace liveness.step1

Dead-code Elimination

A statement x = e is dead, if the variable x does not live out this statement. So, this statement can be safely removed.

Exercise 15. Finish the dead-code optimization in the file cfg.optimizations.DeadCode.java. You should pay special attention to the pitfalls mentioned in the Tiger book not to remove live statements. Also note that removing one dead code may make other live code dead, so make sure that your dead-code elimination optimization can eliminate ALL dead code.

To test your Tiger compiler, you can run your Tiger compiler on the monster program from lab 1, which contains many dead code.

Reaching Definitions

You will write the reaching definitions analysis in this part of the lab. Intuitively, the reaching definition will analyze which definition can reach which use.

Exercise 16. Finish the reaching definition analysis in the file cfg.optimizations.ReachingDefinition.java. This code is similar to the liveness analysis code, except for it performs the analysis in forward manner.

When in doubt, your Tiger should emit necessary tracing information.

Constant Propagation

Suppose, for any statement s of the form x = y or x = y op z, where y is defined in some statement t like y = c where c is a constant. And suppose the definition t is the unique definition that reaches the statement s. Then you can replace the variable y, in the statement s, with the constant c from the statement t.

Exercise 17. Finish the constant propagation code in the file cfg.optimizations.ConstProp.java. Note that to this point, the Tiger compiler will perform all these analysis and optimizations in a linear order, however, this is not the desired order for optimizations. So, you may want to modify the cfg.optimizations.Main.java code to use a fancy order.

Copy Propagation

Copy propagation is like constant propagation. Suppose, for any statement s of the form x = y or x = y op z, where y is defined in some statement t like y = m with a variable m. And suppose the definition t is the unique definition that reaches the statement s. Then you can replace the variable y, in the statement s, with the variable m from the statement t.

Exercise 18. Finish the constant propagation code in the file cfg.optimizations.ConstProp.java.

Available Expressions

Consider the statement x = y op z, if the expression y op z has been computed before this statement, then the expression y op z can reuse the computed result. Hence, you can compute whether or not the expression y op z is available here.

Exercise 19. Finish the available expression analysis in the file cfg.optimizations.AvailExp.java.

Common Sub-Expression Elimination (CSE)

For the statement x = y op z, if the expression y op z is available. We can substitute the expression y op z with the most recent definition using y op z. You can read the Tiger book section 17.3 for this algorithm.

Exercise 20. Finish the constant propagation code in the file cfg.optimizations.Cse.java.

Don't forget to use the -verbose to trace the execution time of each pass.

Safe Points and Precise GC

Liveness analysis has many important applications in compiler optimizations such as register allocations, and so on. But for this part of the lab, you will consider another kind of important application for liveness analysis: precise garbage collection.

Recall that, in lab 4, you have written the Gimple garbage collector. To help Gimple find the roots, the Tiger compiler will generate GC stack frames telling Gimple where to locate the roots. One key point to note is that the GC stack frames is a one-to-one correspondence to the call stack frames, that is, there is just one GC stack frame on one call stack frame (and thus one function).

Though simple to implement, this approach has serious drawbacks, it can not collect garbage precisely. Consider one challenge problem from the previous lab:

  A f (){
    A x, y;
    x = new A();
    // suppose that Java heap is out of space here...
    y = new A();
    return y;
  }

suppose that after allocating object x, there is no space in the Java heap to continue to allocate y, so the Gimple collector should take over. However, the GC stack frame would contain x so that the object pointed by x will NOT be collected. But a close look at the code will reveal that that object can indeed be collected because x will not live out the first statement.

In order to perform a much precise garbage collection, we can introduce the concept of safe points: positions which are safe to emit GC stack frames. Because the MiniJava language you are compiling is single-threaded, so the safe points are function invocations (including constructor invocations). The basic idea is to emit different GC stack frames before each function invocation, and each GC stack frame only contains the live references at that point (that is: the live in set of that function invocation). To better understand this, consider the code fragment above, you can generate two different GC stack frame for the two method invocations, like this:

  A f (){
    A x, y;
    // generate the first frame (empty):  {}
    x = new A();
    // generate the second frame: {y}
    y = new A();
    return y;
  }

As you can see, the reference x will not appear in any GC frame at all.

Exercise 21. Port your Gimple garbage collector to control-flow graph to do precise garbage collection. You should calculate the live in set for each method invocation and generate GC frame based on that.

Exercise 22. In the above code, the variable x is never used. So can we do dead code elimination to remove the assignment statement to x?

Handin

This completes the lab. Remember to hand in your solution to the information system.