Conditionals
Now that we’ve implemented booleans, we can implement
if. Our if form looks like this:
(if <test> <then> <else>)
An if expression evaluates to the
then expression if test evaluates to a
“truthy” value, and evaluates to the
else expression otherwise. Remember that in our language,
all values other than false are truthy!
What makes these conditional expressions different from
operations we’ve seen before is that we’ll need to evaluate
different expressions depending on the value of another expression. This
is easy in the interpreter–we’ll just use OCaml’s
if expression! In the compiler, we’ll rely on a
feature of x86-64 that we haven’t seen yet.
Conditionals in the interpreter
This part is pretty simple! We’ll just add a case to
interp_exp:
let rec interp_exp (exp : s_exp) : value =
match exp with
(* some cases elided... *)
| Lst [Sym "if"; test_exp; then_exp; else_exp] ->
if interp_exp test_exp = Boolean false then interp_exp else_exp
else interp_exp then_exp
And that’s it! The one thing to note here is that we only evaluate one of the two expressions.
Conditionals in the compiler
x86-64 doesn’t have “if” built in, but it does have a standard way of implementing conditionals: with conditional jumps.
So far, all of the assembly code we’ve seen is
straight-line code: we start executing instructions at the
entry label, and keep going until we get to
ret. We can write straight-line code in higher-level
languages too (and this code is generally pretty easy to compile to
assembly). Higher-level languages also have various constructs to
execute code conditionally, or more than once–things like
conditionals and loops and functions.
x86-64 machine code, like most machine codes, really only has one
way of writing non straight-line code: jumps. A jump instruction
lets us start executing from a label elsewhere in our program.
It’s what the runtime does to start executing from our
entry label.
A conditional jump lets us jump to another label depending on
the flags we talked about above. We’ll be using
jz <label>, which jumps to a given label
if and only if the ZF flag is set. So in order
to compile (if test then else), we’ll want
something like:
; code for the test expression
cmp rax, 0b00011111 ; compare to boolean false
jz else
; code for the then expression
jmp continue
else:
; code for the else expression
continue:
The “then” code is skipped when the test expression is
false, because of the
jz instruction. The “else” code is skipped
whenever we evaluate the “then” code, because of the
jmp instruction. Cool, right?
Our OCaml implementation follows that pseudocode:
let rec compile_exp (exp : s_exp) : directive list =
match exp with
(* some cases elided ... *)
| Lst [Sym "if"; test_exp; then_exp; else_exp] ->
compile_exp test_exp
@ [Cmp (Reg Rax, operand_of_bool false); Jz "else"]
@ compile_exp then_exp @ [Jmp "continue"]
@ [Label "else"]
@ compile_exp else_exp
@ [Label "continue"]
There’s one big problem here. What if we have more than one
if expression? Something like this:
(if (num? 4) (if (num? false) 1 2) 3)
Right now, our assembler is going to throw an error if we try to compile this program, something like:
program.s:17: error: label `_else' inconsistently redefined program.s:13: note: label `_else' originally defined here program.s:19: error: label `_continue' inconsistently redefined program.s:15: note: label `_continue' originally defined here
We’re using our label names more than once! That’s not
going to work. We’ll need to make sure that each
if expression has its own labels for
else and continue. We’ll use a
function called gensym1 in
order to generate unique label names. We can call gensym like this:
$ Util.gensym "else";; "else__0" $ Util.gensym "else";; "else__1" $ Util.gensym "continue";; "continue__2"
This function is very different from, say, our
compile_exp or interp_exp
functions: it returns a different output every time we call it!
(Indeed, its whole purpose is to return a different output every
time we call it.) It’s defined like this:
let gensym : string -> string =
let counter = ref 0 in
fun s ->
let symbol = Printf.sprintf "%s__%d" s !counter in
counter := !counter + 1 ;
symbol
The counter variable is what makes this function work.
counter is a reference to an integer; it works
sort of like a variable in a typical imperative language like Java
or Python. We can update its value with
counter := <new value> and read its value with
!counter. This little function is a good example of
idiomatic usage of references in OCaml: use references as little as
possible, and hide them in functions that do a specific thing. We
can’t update counter from outside this function.
Using our gensym function, we can complete our
if compiler:
let rec compile_exp (exp : s_exp) : directive list =
match exp with
(* some cases elided ... *)
| Lst [Sym "if"; test_exp; then_exp; else_exp] ->
let else_label = Util.gensym "else" in
let continue_label = Util.gensym "continue" in
compile_exp test_exp
@ [Cmp (Reg Rax, operand_of_bool false); Jz else_label]
@ compile_exp then_exp @ [Jmp continue_label]
@ [Label else_label]
@ compile_exp else_exp @ [Label continue_label]
Looking ahead
Today we introduced two new concepts in x86-64 machine code: flags and jumps. Next time we’ll implement binary operations, for which we’ll need one more concept: memory. After that, though, we really won’t be further complicating our model of how the processor executes. We’ll need a few more instructions here and there, but there won’t be any more big ideas at the assembly level. This will be a blessing and a curse: the way the processor executes is relatively simple and easy to understand, which means that compiling high-level language constructs like functions is pretty challenging! It’s going to be fun.
Footnotes:
Short for “generated symbol”; the name comes from early LISP implementations