Exceptions and Continuations
----------------------------

In this class, we continue our discussion of advanced control
structures, namely exceptions and continuations, by scaling up our
language.

First, we carefully study a trivial language to set up the tools we'll
use next. The syntax of this trivial expression language is:

    v -> n
    e -> v
       | e+e

It has only integer literals and additions, and needs no further
explanation. We can give it an operational semantics in a small-step
style following the book (structural operational semantics, or SOS):

           e1 -> e1'
    ----------------------(E1)
       e1+e2 -> e1'+e2

           e2 -> e2'
    ----------------------(E2)
       v1+e2 -> v1+e2'

    ----------------------(E3)
       v1+v2 -> v           (where v=v1+v2)

Note that rules E1 and E2 are standard search rules, and rule E3 is
the \beta rule, which performs the real evaluation. Of course, it's
easy to design a type system for this small language; I leave this as
an exercise.

We can give it a straightforward implementation, with SML code like:

    datatype e = Num of int
               | Add of e*e

    (* eval performs a single small step; it is not defined on Num,
       since a value takes no step *)
    fun eval e =
      case e of
        Add (Num i, Num j) => Num (i+j)              (* E3 *)
      | Add (Num i, e2)    => Add (Num i, eval e2)   (* E2 *)
      | Add (e1, e2)       => Add (eval e1, e2)      (* E1 *)

Though this code is correct, it's rather inefficient: on every step it
rebuilds the subexpressions surrounding the redex, which have nothing
to do with the current evaluation. In other words, it duplicates the
evaluation context. To see the problem, try evaluating a compound
expression like v1+v2+v3+...+vn.

To cure this problem, we can be lazy about the surrounding context and
use an auxiliary stack to cache the future evaluations. To achieve
this, let's define the stack. First, here is the definition of a stack
frame F:

    F -> []+e
       | v+[]

Intuitively, a stack frame F is just an expression with a hole [] in
it. To obtain a complete expression, we apply F to an expression e:

    F[e]

which looks very much like a function application.

The intended use of the frame F is to record where we are in
evaluating a compound expression. Note that the value v in the frame
v+[] enforces a left-to-right evaluation order.

Given the definition of frames, the stack S is simply a list of
frames:

    S -> .
       | F::S

where the notation . stands for the empty stack and :: stands for
pushing the frame F as the new top element of the stack. Such a stack
is normally called a control stack.

How do we connect the control stack S with the operational semantics?
The basic idea is to cache future computation in the stack. For this
purpose, we design an abstract machine M with the following
configuration:

    M = (S, e)

That is, the machine M is a pair consisting of a control stack S and
an expression e being evaluated. With this machine model, the
operational semantics rules take the following form:

    (S, e) -> (S', e')

That is, starting from a control stack S and an expression e, the
machine steps to a new control stack S' and a new expression e' (which
will be evaluated further).

Here are the rules for the operational semantics, in a small-step
style:

    -------------------------------------(E4)
       (S, v1+v2) -> (S, v)                  (where v=v1+v2)

    -------------------------------------(E5)
       (F::S, v) -> (S, F[v])

    -------------------------------------(E6)
       (S, e1+e2) -> ([]+e2::S, e1)

    -------------------------------------(E7)
       (S, v1+e2) -> (v1+[]::S, e2)

Basically, rule E4 is the \beta evaluation rule, and rule E5 pops a
frame F from the control stack and applies it to the current value v.
Thus, F[v] becomes the candidate for the next round of evaluation.
Rules E6 and E7 are search rules, as above; E6 is meant for the case
where e1 is not yet a value, and E7 for the case where e2 is not yet a
value.

The initial configuration of this abstract machine is

    (., e)

whereas the final configuration is

    (., v)

where v is the evaluation result of e. For example, (., (1+2)+3) steps
by E6 to ([]+3::., 1+2), by E4 to ([]+3::., 3), by E5 to (., 3+3), and
by E4 to the final configuration (., 6). I encourage you to do some
exercises to make sure you understand all the rules before continuing.

The next question is: how do we implement these rules? The key job
here is to encode frames and control stacks.

At first glance, we can encode them as a sum type:

    datatype frame = LeftAdd of e    (* []+e *)
                   | RightAdd of e   (* v+[]; the argument must be a value *)

    type stack = frame list

That is, to define a frame F, we have one dedicated constructor for
each possible position of the hole. A control stack is just a list of
frames. However, this strategy has one serious drawback: for the frame
application F[v], we have to perform a case analysis followed by
constructor applications. This becomes annoying as the number of
possible positions of the hole increases.

We can use an alternative solution, one we have already used to encode
the lambda calculus: higher-order functions. Here is the definition:

    type frame = e -> e

The key idea here is that each frame can be treated as a converter
from one expression to another. (Note that the first e should really
be a value, but I choose to write e here to make function composition
easier; we'll come back to this topic later in this lecture.) Needless
to say, the key advantage is that we avoid case analysis and use a
simple function call instead. I leave the implementation details as
programming assignments.

--------------------------------------------------

With the control stack in place, I now extend the language with
exceptions:

    v -> n
    e -> v
       | e+e
       | throw
       | try e catch e

The intended meaning of the newly added constructs is:

  * throw: throws a nameless exception that carries no value; and
  * try e1 catch e2: first evaluates e1. If e1 does not throw, e1's
    value is the result. If e1 does throw, the evaluation of e1
    aborts and e2 is evaluated instead.

It's worth recalling that "throw" is dynamically scoped.

Before giving the operational semantics, let's first design a type
system. The key idea is that "throw" should have an arbitrary type, so
that it can appear in any expression.

With this in mind, we have the following type definitions:

    T -> Int
       | Any

and typing rules:

    ------------------------------(T-Int)
       |- n: Int

       |- e1: Int    |- e2: Int
    ------------------------------(T-Add)
       |- e1+e2: Int

    ------------------------------(T-Throw)
       |- throw: Any

       |- e1: T    |- e2: T
    ------------------------------(T-Try)
       |- try e1 catch e2: T

From the implementation point of view, the "Any" type should be
treated as equal to any other type.

Now, let's turn to the operational semantics. We first extend the
definition of frames with the syntactic construct for "try":

    F -> []+e
       | v+[]
       | try [] catch e

which makes it clear that the hole can only appear in the try part of
a try-catch expression.

Now we formalize the operational semantics rules (only the newly added
rules are shown here):

    ----------------------------------------------------(E-Try)
       (S, try e1 catch e2) -> (try [] catch e2::S, e1)

    ----------------------------------------------------(E-Throw)
       (S1@((try [] catch e2)::S2), throw) -> (S2, e2)
                          (S1 does not contain a try-catch frame)

The E-Try rule simply installs a new frame onto the control stack S.
The E-Throw rule is much more interesting: it crawls down the control
stack S, looks for the first try-catch frame, and evaluates its catch
subexpression. (Think about what happens if there is NO such try-catch
frame in the control stack S.) Traditionally, this technique is called
stack unwinding in the compiler literature. (A rule for the normal
case, in which e1 reaches a value v and the try frame is simply
discarded, is also needed; it is not shown here.)

An exception may carry a value. To reflect this, I modify the language
definition to the following one:

    v -> n
    e -> v
       | e+e
       | throw e
       | try e catch e

Note that "throw" now takes an expression e as its argument: the
exception.

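Returning for a moment to the valueless "throw": the E-Try and E-Throw
rules above, together with E4 to E7, can be sketched in SML as
follows. This is only a minimal sketch under stated assumptions, not a
reference implementation. The constructor names (AddL, AddR, TryF),
the Uncaught exception, and the rule that discards a try frame once
the body has reached a value are all assumptions made for
illustration; frames use the sum-type encoding.

```sml
(* Hypothetical encoding: all constructor names here are illustrative. *)
datatype exp
  = Num of int
  | Add of exp * exp
  | Throw
  | Try of exp * exp

datatype frame
  = AddL of exp     (* []+e *)
  | AddR of int     (* v+[] *)
  | TryF of exp     (* try [] catch e *)

type stack = frame list

(* Assumption: a throw with no enclosing try aborts the machine. *)
exception Uncaught

(* Stack unwinding: drop frames until the nearest try frame is found,
   then continue with its handler on the rest of the stack. *)
fun unwind (TryF handler :: s) = (s, handler)
  | unwind (_ :: s) = unwind s
  | unwind [] = raise Uncaught

(* One machine transition; NONE marks the final configuration (., v). *)
fun step (s, Add (Num i, Num j)) = SOME (s, Num (i + j))      (* E4 *)
  | step (s, Add (Num i, e2)) = SOME (AddR i :: s, e2)        (* E7 *)
  | step (s, Add (e1, e2)) = SOME (AddL e2 :: s, e1)          (* E6 *)
  | step (s, Try (e1, e2)) = SOME (TryF e2 :: s, e1)          (* E-Try *)
  | step (s, Throw) = SOME (unwind s)                         (* E-Throw *)
  | step (AddL e2 :: s, Num i) = SOME (s, Add (Num i, e2))    (* E5 *)
  | step (AddR i :: s, Num j) = SOME (s, Add (Num i, Num j))  (* E5 *)
  | step (TryF _ :: s, Num i) = SOME (s, Num i)  (* assumed: body did not throw *)
  | step ([], Num _) = NONE

(* Iterate transitions until the machine reaches a final configuration. *)
fun run (s, e) =
  case step (s, e) of
    SOME c => run c
  | NONE => e

fun eval e = run ([], e)
```

With these definitions, eval (Try (Add (Num 1, Throw), Num 42))
evaluates to Num 42: the throw discards the pending frame 1+[] on its
way down to the try frame. Because unwind inspects the run-time stack
rather than the program text, the handler that catches a throw is the
most recently installed one, which is the dynamic scoping mentioned
earlier.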
Let's first consider the typing rules. In order to type the "throw"
expression, we dedicate a special type symbol "Exn". That is, the
current type definitions are:

    T -> Int
       | Any
       | Exn

The typing rules must take care of the expression associated with
"throw":

       |- e: Exn
    --------------------------------(T-Throw)
       |- throw e: Any

       |- e1: T    |- e2: Exn->T
    --------------------------------(T-Try)
       |- try e1 catch e2: T

The T-Try rule states clearly that the expression e2 should be of
function type Exn->T, so that it absorbs the exception value. It's not
hard to give an operational semantics to this language; I leave it as
an exercise.

------------------------------------------------------------

There is another point of view on the control stack: we can treat it
as a composition of functions, that is:

    F::S = F o S

The key advantage of this view is that the operational semantics rules
now become tail recursions with a function as the first component of
the configuration. For instance, the evaluation rules for addition now
look like:

    --------------------------------------------------
       (S, e1+e2) -> ((\lambda x.x+e2) o S, e1)

    --------------------------------------------------
       (S, v1+e2) -> ((\lambda x.v1+x) o S, e2)

    --------------------------------------------------
       (S, v1+v2) -> (S, v)                  (where v=v1+v2)

plus the normal stack application rule:

    --------------------------------------------------
       (S, v) -> (., S(v))

The function argument S is called a continuation. Continuations have
many nice properties that make them a powerful programming
abstraction, from both the theoretical and the practical point of
view. But what makes a continuation really interesting is that we can
make it a first-class object by introducing it into the language.

Let's extend our language with continuations:

    v -> n
       | cont(S)
    e -> v
       | e+e
       | callcc (\lambda k.e)
       | appcc e e

The expression callcc (\lambda k.e) binds the current continuation to
k and passes it to e; the expression appcc e1 e2 takes a continuation
e1 and calls it with the argument e2.

Also note that we have a new value cont(S). It's worth remarking that
S is a captured control stack that only shows up in the evaluation
rules (as we'll see next), so we don't bother to define S in the
surface syntax. Also note that in the standard literature on
continuations, appcc is often written "throw e1 e2"; however, I prefer
the current syntax in order to reserve the latter for exceptions.

First, let's give the static semantics of continuations. Here are the
type definitions:

    T -> Int
       | Cont(T)

where Cont(T) is the type of a continuation. With this in mind, we
give the typing rules (continuation-related ones only):

       G |- S: T
    ------------------------------------
       G |- cont(S): Cont(T)

       G, x: Cont(T) |- e: T
    ------------------------------------
       G |- callcc (\lambda x.e): T

       G |- e1: Cont(T)    G |- e2: T
    ------------------------------------
       G |- appcc e1 e2: T

We also extend the definition of frames:

    F -> ...
       | appcc [] e2
       | appcc v []

and the operational semantics rules:

    ------------------------------------------------------(E-Cont)
       (S, callcc (\lambda x.e)) -> (S, [x|->cont(S)]e)

    ------------------------------------------------------(E-Appcc1)
       (S, appcc (cont(S')) v2) -> (S', v2)

    ------------------------------------------------------(E-Appcc2)
       (S, appcc v1 e2) -> (appcc v1 []::S, e2)

    ------------------------------------------------------(E-Appcc3)
       (S, appcc e1 e2) -> (appcc [] e2::S, e1)

Essentially, callcc duplicates the current control stack and passes it
to e, whereas appcc destroys the current stack and restores the saved
old stack (thus transferring control to some alternate universe).

The most appealing fact about continuations is that they serve as the
functional counterpart of GOTO, on top of which advanced programming
features can be built: non-local exits, generators, backtracking,
coroutines, user-level threads, etc.
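
The continuation rules above can be tried out with a small machine.
What follows is a minimal sketch under stated assumptions, not a
reference implementation: the constructor names, the string
representation of variables, the Stuck exception raised on ill-typed
or open terms, and the helper names (subst, plug, isValue) are all
hypothetical. Frames use the sum-type encoding, and a captured stack
is stored directly inside the Cont value. The substitution assumes the
program is closed, so captured stacks never contain free variables.

```sml
(* Hypothetical encoding: all names below are illustrative. *)
datatype exp
  = Num of int
  | Var of string
  | Add of exp * exp
  | Callcc of string * exp     (* callcc (\lambda x.e) *)
  | Appcc of exp * exp         (* appcc e1 e2 *)
  | Cont of frame list         (* cont(S), created only at run time *)
and frame
  = AddL of exp                (* []+e *)
  | AddR of int                (* v+[] *)
  | AppL of exp                (* appcc [] e2 *)
  | AppR of frame list         (* appcc (cont(S')) [] *)

(* Assumption: ill-typed or open terms make the machine stuck. *)
exception Stuck

fun isValue (Num _) = true
  | isValue (Cont _) = true
  | isValue _ = false

(* [x|->v]e for a closed value v; an inner binder of the same name shadows x. *)
fun subst (x, v) e =
  case e of
    Num _ => e
  | Cont _ => e
  | Var y => if x = y then v else e
  | Add (a, b) => Add (subst (x, v) a, subst (x, v) b)
  | Appcc (a, b) => Appcc (subst (x, v) a, subst (x, v) b)
  | Callcc (y, b) => if x = y then e else Callcc (y, subst (x, v) b)

(* Frame application F[v]. *)
fun plug (AddL e2, v) = Add (v, e2)
  | plug (AddR i, v) = Add (Num i, v)
  | plug (AppL e2, v) = Appcc (v, e2)
  | plug (AppR k, v) = Appcc (Cont k, v)

(* One machine transition; NONE marks the final configuration (., v). *)
fun step (s, e) =
  case e of
    Add (Num i, Num j) => SOME (s, Num (i + j))                  (* E4 *)
  | Add (Num i, e2) =>
      if isValue e2 then raise Stuck else SOME (AddR i :: s, e2) (* E7 *)
  | Add (e1, e2) =>
      if isValue e1 then raise Stuck else SOME (AddL e2 :: s, e1) (* E6 *)
  | Callcc (x, body) => SOME (s, subst (x, Cont s) body)         (* E-Cont *)
  | Appcc (Cont s', e2) =>
      if isValue e2 then SOME (s', e2)                           (* E-Appcc1 *)
      else SOME (AppR s' :: s, e2)                               (* E-Appcc2 *)
  | Appcc (e1, e2) =>
      if isValue e1 then raise Stuck else SOME (AppL e2 :: s, e1) (* E-Appcc3 *)
  | Var _ => raise Stuck
  | _ => (case s of                        (* e is a value: E5 or final *)
            f :: rest => SOME (rest, plug (f, e))
          | [] => NONE)

(* Iterate transitions until the machine reaches a final configuration. *)
fun run (s, e) =
  case step (s, e) of
    SOME c => run c
  | NONE => e

fun eval e = run ([], e)
```

As a small non-local exit, the term 1 + callcc (\lambda k. 10 + appcc k 5),
written as

    Add (Num 1, Callcc ("k", Add (Num 10, Appcc (Var "k", Num 5))))

evaluates to Num 6 under this sketch: k captures the stack 1+[]::.,
and applying it discards the pending frame 10+[]. If the body does not
use k, as in 1 + callcc (\lambda k. 10 + 5), the result is Num 16.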