The organization of CVM is simply a tuple:
CVM -> (store, prog)where
prog
is defined in programming assignment #2. And
the store
is a mapping
store : id -> numwhich maps identifiers
id
to integers num
.
Two useful operations could be defined on a store st
:
st (id) st [id |-> n]the first one looks up and returns the integer value of
id
in the store st
. And
the second one updates the value of indentifier id
with
integer n
in store st
, and if id
does not belong to the domain of st
, this operation simply
adds the binding from id
to n
into st
.
With all the above notations, we can define a bunch of relations to .
The first relation (st, e) -> v
is given by a function
E(st, e) = v
which classifies the
execution of expressions. The execution of
an expression e
on some store st
always returns
e
's value v
(an integer). The detailed rules are presented
below, and are inductively
defined according to the structure of e
.
E(st, num) = num E(st, id) = st (id) E(st, e1+e2) = E(st, e1) + E(st, e2) E(st, e1-e2) = E(st, e1) - E(st, e2) E(st, e1*e2) = E(st, e1) * E(st, e2)
The second relation (st, s) -> st'
is given by a function
S(st, s) = st'
which classifies the
execution of statements. The execution of
a statement s
on some store st
always returns
a new store st'
. The detailed rules are presented
below, and are inductively
defined according to the structure of s
.
S(st, id=e) = st [id |-> E(st, e)] S(st, print(e)) = st // with side effect of printing E(st, e) on the screen terminated with a new line
The third and last relation (st, prog) -> st'
is given by a
function P(st, prog) = st'
which classifies the
execution of a whole program. The execution of
a program prog
on some store st
always returns
a new store st'
. The detailed rules are presented
below, and are inductively
defined according to the structure of prog
.
P(st, stm prog) = P(S(st, stm), prog) P(st, stm) = S(st, stm)
The PVM is also a tuple:
PVM -> (store, prog)where
prog
is a miniPentium program and the definition of
store
is the same as the above problem.
Also, the execution of PVM is specified by a collection of relations. The first
relation (st, v) -> n
is given by function V(st, v) = n
which formalizes the execution of operands v
on some store
st
.
V(st, num) -> num V(st, id) -> st (id)
The second relation specifies the execution of an instruction
, and
is given by a relation (function) I
.
I(st, movl v, id) -> st [id |-> V(st, v)] I(st, addl v, id) -> st [id |-> V(st, id) + V(st, v)] I(st, subl v, id) -> st [id |-> V(st, id) - V(st, v)] I(st, mull v, id) -> st [id |-> V(st, id) * V(st, v)] I(st, print v) -> st // with side effect of printing V(st, v) on screen terminated with a new line
The third and final relation (function) P
specifies
the execution of prog
.
P(st, instruction prog) -> P(I(st, instruction), prog) P(st, instruction) -> I(st, instruction)
In this problem, our job is to design and implement a small compiler "miniVC" which compiles miniC into miniPentium. This task can be divided into several sub-problems:
P, S, E
.
The function P
is defined on miniC prog
. Enssentially,
it translates each statement in prog
in turn.
P(stm prog) = S(stm) P(prog) P(stm) = S(stm)The function
S
is defined on statements stm
.
It translates each
miniC statement into a series of miniPentium instructions.
S(id = exp) = E(exp) movl result, id S(print(exp)) = E(exp) print resultThe function
E
is defined on expressions exp
.
This function has two effects: first, it translates each
miniC expression into a series of miniPentium instructions, and meanwhile it
puts the value of this expression into a special variable
result
.
E(num) = movl num, result E(id) = movl id, result E(e1 + e2) = E(e1) movl result, t1 E(e2) movl result, t2 addl t2, t1 movl t1, result E(e1 - e2) = E(e1) movl result, t1 E(e2) movl result, t2 subl t2, t1 movl t1, result E(e1 * e2) = E(e1) movl result, t1 E(e2) movl result, t2 mull t2, t1 movl t1, resultHint: you may choose to implement the
result
variable
as an external variable or as a return value of function E
.
x
is uninitialized:
y = x; print (y);And the execution of this program will either core dumped or print an arbitary junk value. To cure this, we may statically check whether there exist uninitialized variables in a program, and if so, we'd like to issue a friendly warning or error message to the programmer, instead of going into crazy core dumped at runtime. This process is called (a form of) type checking.
Our job is to design and implement a type checker to check uninitialized variables in miniC and miniPentium. We don't give formal checking here, it's your task to think careful what the proper rules will be.
x = 8; opt1 x = 8; opt2 x = 8; opt3 y = 9; ======> y = 9; ======> y = 9; ======> z = x + y; z = 8 + 9; z = 17; print (z); print (z); print (17); print (17);Opt1 is called contant propogation which substitutes variables with their specific values. And opt2 is called constance folding which calculates an expression's value, if this expression is composed of only constants (note that it's followed by a contant propogation again). And opt3 is called dead-code elimination which eliminates instructions that has no effects on program output.
(1) Your job is to implement all these optimizations on miniC and
miniPentium. Is it true that there will always be a print statement
left, after all such optimizations applied?
(2) Are there any other kind of optimizations that could be applied to
the miniC or miniPentium? If so, try to implement your ideas.
Which the above approach the GCC compiler and the MS VC6.0 compiler use? And why?
In this problem, our task is to design and implement a stack-based
virtual machine "SPVM".
(1) First we would like to revise the definition
of the miniPentium language to make it stack-based (call it stackPentium).
For this strategy
to work, we may revise the existing instructions add some new ones
to miniPentium. For instace, we may revise the definition of
instruction
to this one:
instruction -> addl -> subl -> mull -> print -> pushl v -> popl idNote that we've add two new instrutions
pushl, popl
.
And note some instructions have been revised heavily. For instance, the
addl
instruction now take no operand at all. And the original
addl v, id
instruction could be expanded:
addl v, id ======> pushl v pushl id addl popl idOther instructions could be revised in a siliar way.