Make a Lisp [5] Technical SITREP

28 Apr 2019

This is the fifth SITREP for my ongoing Make-a-Lisp project - a multi-step process that involves implementing JKL: an interpreter for a fully featured Lisp-like programming language. Prior posts include Background, Steps 0-4, Steps 5-8 and Step 9.

The bottom line is that I’ve completed this step successfully (with a couple of minor quirks), thus achieving the overall objective of this project, namely to create using C# a Lisp interpreter (JKL) that is sufficiently powerful to itself host a Lisp interpreter. Yay!

The source code is on GitHub.

Step A: overview

This last step is different from all of the preceding steps, each of which involves extending the core JKL language and then testing the extended interpreter using short programme fragments that exercise the added functionality. In contrast, the primary aim of this step is to give the JKL interpreter (which I’ve just written in C#) a proper workout by running a really complex Mal program - specifically an interpreter for the Mal language itself. For clarity, I’ll refer in this post to the Mal interpreter hosted on JKL as Hosted-Mal.

Rather than trying for a fully functional Hosted-Mal at the outset, Step A is itself approached incrementally, specifically by running successively more complex Hosted-Mal interpreters corresponding to each of the original Mal development steps. Each Hosted-Mal interpreter should be able to pass the tests associated with the original Mal development step. In other words, although I have already run (most of) these tests when developing JKL; I’m now running them again, but this time to test the Hosted-Mal interpreters running on top of JKL. Because the aim of this step is to give JKL a workout, and not to repeat the entire Make-a-Lisp process, I’m using pre-written Hosted-Mal source code taken from the Mal GitHub repository.

In addition to the above, Step A also involves adding a couple of extra core functions that support self-hosting. These were straightforward at this point, so I’ll cut to the chase.

Preparations

The logical first step was to try to use load-file to load the Hosted-Mal source code for Mal step 0. I also tried loading the Hosted-Mal core and env source code files. The functions in these files aren’t tested until the Step 4 tests, but I decided (correctly, as it turned out) that loading them here might identify possible show-stoppers.

This process raised the following issues:

This got me to the point where JKL could successfully run a Hosted-Mal interpreter to the point where Hosted-Mal would pass the Mal Step 0 test, as well as loading Hosted-Mal’s core and env files (testing the reading of but not the functions contained).

Steps 1 to 4

Pleasingly, the tests for steps 1, 2 and 3 then worked first time. The tests for step 4 passed with two exceptions:

In JKL, an empty list isn’t the same as an empty vector. I decided to check the CL semantics of equality, and discovered that equality is a veritable can of worms. For example, CL provides multiple equality predicates e.g. to distinguish between the cases where two symbols refer to the same underlying memory location, vs. the case where two things are ‘conceptually’ identical. The CL reference doesn’t explicitly address the case of comparing an empty list with an empty vector. Nor does my Clojure reference.

With these caveats in mind, I decided to continue testing.

(There are no explicit tests for Step 5, because this step adds Tail-Call Optimisation rather than new callable functions. However, if TCO is broken, most subsequent tests will fail.)

Step 6

The Step 6 Hosted-Mal source code initially failed to load (before I even reached the tests for it) because:

In Step 6, the string quoting issues manifest because Hosted-Mal source code internally defines a load-file function as follows:

(rep "(def! load-file (fn* (f) (eval (read-string (str \"(do \" (slurp f) \")\")))))")

This is a quoted string that itself contains a quoted string. Unfortunately JKL interprets this as a non-terminating string, and aborts loading the Step 6 Hosted-Mal source code. The problem is the use of the str command to wrap "(do " and ")" around the slurped file (so that all of the forms in the file will be processed).

The correct fix is to figure out how to properly quote strings in these literals. What I actually did was to create in JKL a slurp-do core function that adds the (do...) form before returning the slurped file so that the caller doesn’t have to. I then added this to the Hosted-Mal core file and edited the definition of load-file in the Hosted-Mal source code for Step 6 to use slurp-do. I’ll have to make the same edit to all of the other Hosted-Mal source files but it will let me proceed faster so see if there are other significant issues.

With these changes made, the Step 6 tests then passed.

Steps 7, 8 and 9

The tests for these steps mainly passed okay, once I’d fixed load-file and implemented the remaining deferred functions that I’d omitted the first time round.

The one additional fail related to hash-map equality. Specifically, for the test form

(= {:a 11 :b {:c 33}} (hash-map :b {:c 33} :a 11))

the correct result is true. However, JKL initially returned false because it sequentially compares the consecutive elements of the two hash-maps and immediately returns false as soon as two elements don’t match. However, from the perspective of the functions that use hash-maps, the two forms give the same results (e.g. the value of :b is {:c 33} in both) and thus should be considered identical, conceptually at least.

This problem is thus similar to the issue with equality of empty lists and vectors encountered in Step 4 testing. I fixed it by adding a more sophisticated EQ function that returns true if every key / value pair in A is also in B, even if the keys are in a different order. Note that this version of EQ doesn’t check for multiple instances of the same key in a hashmap.

Step A

This was the last set of tests. They passed fairly quickly, after a couple of fixes to resolve the string handling issue. But the real test at this point is to see whether the JKL interpreter can actually self-host, i.e. by running the code for Step A written in Mal itself. This code is as follows (with JKL-specific edits for the quoted string problem):

(load-file "F:\JKL\HostedMal\env.mal")
(load-file "F:\JKL\HostedMal\core.mal")

;; read
(def! READ (fn* [strng]
  (read-string strng)))

;; eval
(def! is-pair (fn* [x]
  (if (sequential? x)
    (if (> (count x) 0)
      true))))

(def! QUASIQUOTE (fn* [ast]
  (cond
    (not (is-pair ast))
    (list 'quote ast)

    (= 'unquote (first ast))
    (nth ast 1)

    (if (is-pair (first ast))
      (if (= 'splice-unquote (first (first ast)))
        true))
    (list 'concat (nth (first ast) 1) (QUASIQUOTE (rest ast)))

    "else"
    (list 'cons (QUASIQUOTE (first ast)) (QUASIQUOTE (rest ast))))))

(def! is-macro-call (fn* [ast env]
  (if (list? ast)
    (let* [a0 (first ast)]
      (if (symbol? a0)
        (if (env-find env a0)
          (let* [m (meta (env-get env a0))]
            (if m
              (if (get m "ismacro")
                true)))))))))

(def! MACROEXPAND (fn* [ast env]
  (if (is-macro-call ast env)
    (let* [mac (env-get env (first ast))]
      (MACROEXPAND (apply mac (rest ast)) env))
    ast)))

(def! eval-ast (fn* [ast env] (do
  ;;(do (prn "eval-ast" ast "/" (keys env)) )
  (cond
    (symbol? ast) (env-get env ast)

    (list? ast)   (map (fn* [exp] (EVAL exp env)) ast)

    (vector? ast) (apply vector (map (fn* [exp] (EVAL exp env)) ast))

    (map? ast)    (apply hash-map
                      (apply concat
                        (map (fn* [k] [k (EVAL (get ast k) env)])
                             (keys ast))))

    "else"        ast))))

(def! LET (fn* [env args]
  (if (> (count args) 0)
    (do
      (env-set env (nth args 0) (EVAL (nth args 1) env))
      (LET env (rest (rest args)))))))

(def! EVAL (fn* [ast env] (do
  ;;(do (prn "EVAL" ast "/" (keys @env)) )
  (if (not (list? ast))
    (eval-ast ast env)

    ;; apply list
    (let* [ast (MACROEXPAND ast env)]
      (if (not (list? ast))
        (eval-ast ast env)

        (let* [a0 (first ast)]
          (cond
            (nil? a0)
            ast

            (= 'def! a0)
            (env-set env (nth ast 1) (EVAL (nth ast 2) env))

            (= 'let* a0)
            (let* [let-env (new-env env)]
              (do
                (LET let-env (nth ast 1))
                (EVAL (nth ast 2) let-env)))

            (= 'quote a0)
            (nth ast 1)

            (= 'quasiquote a0)
            (let* [a1 (nth ast 1)]
              (EVAL (QUASIQUOTE a1) env))

            (= 'defmacro! a0)
            (let* [a1 (nth ast 1)
                   a2 (nth ast 2)
                   f (EVAL a2 env)
                   m (or (meta f) {})
                   mac (with-meta f (assoc m "ismacro" true))]
              (env-set env a1 mac))

            (= 'macroexpand a0)
            (let* [a1 (nth ast 1)]
              (MACROEXPAND a1 env))

            (= 'try* a0)
            (if (or (< (count ast) 3)
                    (not (= 'catch* (nth (nth ast 2) 0))))
              (EVAL (nth ast 1) env)
              (try*
                (EVAL (nth ast 1) env)
                (catch* exc
                  (EVAL (nth (nth ast 2) 2)
                        (new-env env
                                 [(nth (nth ast 2)1)]
                                 [exc])))))

            (= 'do a0)
            (let* [el (eval-ast (rest ast) env)]
              (nth el (- (count el) 1)))

            (= 'if a0)
            (let* [cond (EVAL (nth ast 1) env)]
              (if (or (= cond nil) (= cond false))
                (if (> (count ast) 3)
                  (EVAL (nth ast 3) env)
                  nil)
                (EVAL (nth ast 2) env)))

            (= 'fn* a0)
            (fn* [& args]
              (EVAL (nth ast 2) (new-env env (nth ast 1) args)))

            "else"
            (let* [el (eval-ast ast env)
                  f (first el)
                  args (rest el)]
              (apply f args))))))))))


;; print
(def! PRINT (fn* [exp] (pr-str exp)))

;; repl
(def! repl-env (new-env))
(def! rep (fn* [strng]
  (PRINT (EVAL (READ strng) repl-env))))

;; core.mal: defined directly using mal
(map (fn* [data] (env-set repl-env (nth data 0) (nth data 1))) core_ns)
(env-set repl-env 'eval (fn* [ast] (EVAL ast repl-env)))
(env-set repl-env '*ARGV* (rest *ARGV*))

(rep (str "(def! *host-language* 'JKL)"))
(rep "(def! not (fn* [a] (if a false true)))")
(rep "(def! load-file (fn* (f) (eval (read-string (slurp-do f)))))")
(rep "(defmacro! cond (fn* (& xs) (if (> (count xs) 0) (list 'if (first xs) (if (> (count xs) 1) (nth xs 1) (throw 'odd-num)) (cons 'cond (rest (rest xs)))))))")
(rep "(def! *gensym-counter* (atom 0))")

(rep "(def! gensym (fn* [] (symbol (str 'G__ (swap! *gensym-counter* (fn* [x] (+ 1 x)))))))")
(rep "(defmacro! or (fn* (& xs) (if (empty? xs) nil (if (= 1 (count xs)) (first xs) (let* (condvar (gensym)) `(let* (~condvar ~(first xs)) (if ~condvar ~condvar (or ~@(rest xs)))))))))")

;; repl loop
(def! repl-loop (fn* []
  (let* [line (readline "mal-user> ")]
    (if line
      (do
        (if (not (= "" line))
          (try*
            (println (rep line))
            (catch* exc
              (println "Uncaught exception:" exc))))
        (repl-loop))))))


(def! main (fn* [& args] 
  (if (> (count args) 0)
    (prn "Ignoring args")
    (repl-loop))))

(apply main *ARGV*)

So, I launched JKL and then used load-file to load the Step A source code, thus launching a Hosted-Mal interpreter. I then immediately used load-file to load the Step A source code on top of Hosted-Mal - so that my self-hosting Lisp is itself self-hosting. I then tried using the or macro, defining a function (a Fibonacci calculator), and then calling the new function. Amazingly, it all works! Here’s a transcript of the terminal session.

*** Welcome to JK's Lisp ***
JKL> (load-file "F:\JKL\HostedMal\stepA_mal.mal")
mal-user> (load-file "F:\JKL\HostedMal\stepA_mal.mal")
mal-user> (or false false 1)
1
mal-user> (def! fib (fn* (N) (if (= N 0) 1 (if (= N 1) 1 (+ (fib (- N 1)) (fib (- N 2)))))))
<fn [& args] (EVAL (nth ast 2) (new-env env (nth ast 1) args))>
mal-user> (fib 5)
8
mal-user>

The downside is that the resultant interpreter is very slow, taking approximately 10 seconds to run (fib 5).

Anyway, at this point, I’ve largely completed the Make-a-Lisp project. The outstanding issues are as follows: