Updated at Feb 2, 2017

This implementation is based on theory behind Template Haskell [1]. See full implementation here.

After about 6 hours’ hacking, I finally get something working on some basic examples. I have to commit that it is very difficult – I was almost writing code only by intuition! Next, I am going to go over my implementation, and maybe a more formal proof about its correctness.

Basics

The lambda calculus is the simply typed one. No polymorphism, not even “let” binding. The focus is on the template system. It is very similar to the Haskell one, for the following characteristics:

  1. Splice operator $: template expansion at call-site
  2. Quasi-Quote [| ... |]
  3. Type-safety inside template
  4. AST construction functions, including genstr, TmAbs, TyInt et cetera

The examples in src/Example.hs:

-- tm1
$((\(s : Int -> TyQ) -> (s 1)) (\(i: Int) -> [| $i + $i |]))

-- tm2
$((\(s : String) -> TmAbs s TyInt (TyVar s)) genstr)

Inner Working

I only implement the stage-1, i.e. type checking before expansion and the expansion itself. However, the algorithms in stage-1 can be easily adapted as stage-2 type checking or interpreting of common code.

The stage means that:

    _______ with meta function ____ ___________ common code _______
    |                              |                              |
typeCheckStage1 term  ->  compileStage1 term -> typeCheckStage2 term -> Backend

In stage 1, the type checking of splice will always return TyBottom, which means that we don’t know its type until it is expanded. However, the meta-expression inside splice must be typed as tyQ itself. The main source of TyQ is quasi-quoting and AST construction functions.

The compileStage1 (i.e. compile in src/Compiler.hs is like a state machine. The Compile stage simply goes through the code and find meta call-sites. When entering the meta call-site, it moves to the Evaluate stage (i.e. evaluate), which will try to evaluate the meta-function. If then it encounters bracket expression, it will change to Compile stage again; or if it encounters TmTm expression (i.e. AST construction expression in user-level), it will go to evaluateTm.

The compile time environment is composed of a name generator and a name to expression binding. The bindings are used to store built-in functions and compile-time variable bindings.

While we have already used substIn to instantiate the bindings, why bother adding value bindings to the environment state? It is because, when we are using substIn, only compile-time structure will be inspected; the $i in brackets is ignored. Only with additional state, can bindings be passed from outer-splicing scope to inner-splicing scope.

Correctness

The first problem is how to define “correctness” here?

  • The compile-time semantics model
  • If the stage 1 type checking is passed, then there should be no stage 1 compile-time exception.
  • No stage 1 exclusive stuff will still exist after expansion

expansion algorithm

$Comp: (Env, Term) \mapsto Term$:

Compile-time reduction algorithm

$Eval: (Env, Term) \mapsto Term$:

Syntax construct mapping algorithm

$EvalTm: (Env, TmTerm) \mapsto Term$

Type safety

The type safety can be categorized into two aspects:

  • Common ones
  • Meta ones

All the divergent cases stated as above (with $\bot$) are belonging to the second one. The common ones are the classical ones for simply typed lambda calculus.

Like the compilation, the type checking also depends on contexts. However, for the common cases, type checking code can be shared (see typeCheck). For meta-specific constructs, type checker behaves differently (see typeCheckBracket and typeCheckSplice).

So, these rules effective eliminates the meta undefined behaviours:

typeCheckBracket env (TmBracket _) = Left "TmBracket in bracket"
typeCheckBracket env (TmType _)    = Left "TmType in bracket"
typeCheckBracket env (TmTm _)      = Left "TmTm in bracket"
typeCheckSplice env (TmSplice tm)  = Left "TmSplice in splice"

One thing worth noting is that, the splice term inside bracket is typed as TyWildCard, meaning a type which can match anything. However, after expansion, it should not appear again. And to well-type the function application under the presence of such type, I come up with a hack-ish unify function.

Since the type system is pretty naïve, I will not do any formalization here.

Stage safety

Theorem 1: For any term input, after the stage 1 compilation, it should have no bracket, splice, TmTerm or TmType (i.e., is a common term).

Proof: Let’s look at $Comp$ rules, for COMP-APP, COMP-ABS, it can be inductively proven. For COMP-BRACKET, COMP-TMTERM, COMP-TMTYPE, they should not appear after type checking. For COMP-INT, COMP-STR, and COMP-VAR, it is the trivial case.

Now let’s see the non-trivial case: COMP-EVAL.

Theorem 2: For any term input splice, after stage 1 evaluation, it will become common term.

Proof: EVAL-APP, EVAL-VAR can be proven inductively; EVAL-STR, EVAL-INT, EVAL-TMTYPE are trivial cases; EVAL-SPLICE is eliminated by typing rules; EVAL-BRACKET can be proven using the theorem 1; EVAL-TMTERM will be proven later; EVAL-ABS can be reduced to substitution, which can be proven inductively.

Theorem #3: For any TmTerm, after evalTm, it will become common term.

Proof: EVALTM-VAR, EVALTM-VAR and EVALTM-STR are trivial cases; Other two can be proven inductively.

References

  • Sheard, T., & Jones, S. P. (2002). Template meta-programming for Haskell. ACM SIGPLAN Notices, 37(12), 60. http://doi.org/10.1145/636517.636528