Compiling expressions
March 17, 2024

Playing around with compiling expressions to WASM.

Alright, so two weeks have passed, since I have written anything on here. Time to get going again.

My goal for 2024 is to implement a simple programming language that compiles to WebAssembly and that I can use to solve at least one Advent of Code problem.

This post is just a small note on where I am. More detailed posts on the lexer and parser will be coming in the next few weeks. Currently I am making baby steps in compiling an AST to WebAssembly instructions using the library wasm_encoder in Rust. More on that in the next few weeks as well.

But I am unsure of an elegant way to compile an expression like (+ 3 5) to WebAssembly.

Compiling

For now, let’s assume the abstract syntax tree is defined like this:

enum Expr {
    Number(f64),
    Symbol(String),
    List(Vec<Expr>)
}

That would mean the following piece of code

(/ (* 7 5) 2)

would be parsed into the following concrete syntax tree

Expr::List(vec![
  Expr::Symbol("/"),
  Expr::List(vec![
    Expr::Symbol("*"),
    Expr::Number(7.0),
    Expr::Number(5.0),
  ])
  Expr::Number(2.0),
])

Compiling it to WebAssembly should yield the following instructions.

f64.const 7
f64.const 5
f64.mul
f64.const 2
f64.div

If you are confused about these instructions, do not worry. They become more understandable, when we go on a brief tangent about a cool concept called stack machines.

Stack machines

As far as I understand, stack machines are machines, where temporary values live on a stack instead of in registers. Consider the compiled program from before, this time annotated with the values on the stack during execution.

            # []
f64.const 7 # [7]
f64.const 5 # [5, 7]
f64.mul     # [35] (7 * 5)
f64.const 2 # [2, 35]
f64.div     # [17.5]

As you can see, we start with an empty stack. const pushes a literal value on the stack (there are also variants for integers). mul pops the first two values from the stack, multiplies them and pushes the result back onto the stack. div does the same with division.

What confused me, was the order of arguments from the stack that div used. I was expecting, that the operands would be “filled in” from left to right and not from right to left. So, that

const a
const b 
div

would result in b / a and not in a / b. That would result in the following program, which I compiled the expressions to at first.

f64.const 2
f64.const 5
f64.const 7
f64.mul
f64.div

Additionally I thought this was reverse polish notation, but while researching for this post, I realized, that the actual compiled program was in reverse polish notation.

Back to compiling

So now, it seems to compile an expression like (+ 3 5) I essentially need to flip around the operator/function (3 5 +) and then just emit the corresponding instructions. I am just not yet sure, whether that is how it’s done and whether I should do this in a pass before emitting instructions or while emitting instructions.

Well, we’ll see.