Environments and stack tracing
TBD - write intro para
Stack tracing - errors with line numbers and/or stack traces - is one of the recommended Next Steps in the MAL guide. I decided I needed stack tracing to support my implementation of loop
(see the separate deep dive) after realising that the tracing and looping mechanisms have some common requirements and mechanisms.
The intent is that when an exception is thrown, JKL
should print something like a stack trace of the preceeding function calls, to help debugging. In JKL
, the closest thing to a function stack is the set of nested env
objects that hold the environmental context (symbols and their values) for evaluation. My initial thinking was that when an exception is raised, the current env
should be passed to the exception handler so that the context for the exception can be written to the error log (currently the console).
This post describes the changes required to JKL
1.0 to enable stack tracing, starting with a quick recap of the environment objects that support my planned approach.
Background
Environments in JKL
1.0
The EVAL
method takes an env
object as an argument - each of which contains a C# dictionary that maps symbols to values. Symbols and their values are added to an env
by the def!
special form. When a symbol is encountered during evaluation, JKL
searches for the symbol’s value using the Find
method of the local env
.
The first env
is created by the top-level REPL, before any builtin functions and other startup values are loaded. During evaluation, additional env
objects are created:
- By
fn*
, to bind argument names and their values. Theseenv
objects are stored with the functions themselves, providingJKL
’s closure mechanism - By
let*
, to hold thelet
variables - By
catch
, to store the values ofthrow
statements - By
EVAL
, when executing functions created byfn*
Each env
stores a pointer to the parent env
within which it was created (or null
for the outermost REPL), thus providing a nested scoping mechanism for symbols that have the same name. If Find
spots the symbol in the local env
, it returns it. Otherwise it recursively examines the parent env
.
Stack tracing using env
Changes required
env
objects clearly hold some of the basic information required for stack tracing. However, there are some limitations:
env
objects do not record the necessary contextual information, such as the reason they were created (e.g. because of alet*
) or the origin of their symbols (such as the line number of the file, if any, that introduced the symbol)env objects
are passed to successiveEVAL
calls and are thus directly accessible withinEVAL
itself. However, theenv
is not passed to built-in functions or the methods associated withJKL
types (such as arithmetic operations or sequence access), and thus cannot in turn be passed to exception handlers when errors are detected in these places
The rest of this section describes how I addressed these limitations.
Annotating env
objects
To add annotations to env
objects I derived subclasses of env
class to represent the different reasons for which an environment is created (REPL, fn*
, let*
, etc) and gave each subclass a different label for printing (initially I used an enum for the same purpose before realising that subclasses would be more convenient). I then allowed the subclass for functions (EnvFnCall
) to store the function associated with the env
so that this could be printed out in the stack trace.
Making env
objects accessible
Making the current env
accessible to exception handlers was more challenging. There seemed to be two broad approaches. The first is to try to pass the current env
object (e.g. as an argument) to every place where errors could be thrown. This would require a massive number of changes (literally 100s) across the entire C# code base.
The second approach is to establish a single record of the current env
, and to update this when a new evaluation context is established, and again at the beginning of every EVAL
. This should at least track exceptions raised during reading and evaluation, which are the vast majority of cases.
First try
Here’s the result of using the new stack trace mechanism. The error occurs in a function f
that takes a single argument, binds the arg to a local let*
variable, and then tries to numerically add the let variable value to a string:
*** Welcome to JK's Lisp ***
JKL> (def! f (fn* (fnArg) (let* (n fnArg) (+ n "wtf"))))
<fn (fnArg) (let* (n fnArg) (+ n "wtf"))>
JKL> (f 1)
Eval error: Plus - expected a number but got: '"wtf"'
In let*
In <fn (fnArg) (let* (n fnArg) (+ n "wtf"))>
In REPL
This is looking good! But what happens if the error occurs in a nested function call?
...
JKL> (def! g (fn* (n) (f n)))
<fn (n) (f n)>
JKL> (g 1)
Eval error: Plus - expected a number but got: '"wtf"'
In let*
In <fn (fnArg) (let* (n fnArg) (+ n "wtf"))>
In REPL
JKL
is correctly printing the immediate context for the error (the let*
in f
) but the fact that f
was called by g
isn’t shown. Here’s another more complex example, involving f
as before, but now called from within a closure (a dynamically defined function) itself declared in a let
in a separate REPL-level function g
:
(def! f (fn* (fArg)
(let* (n fArg)
(+ n "wtf"))))
(def! g (fn* (gArg)
(let* (closureFn (fn* (clArg)
(prn "In closure") (f clArg)))
(closureFn gArg))))
When g
is invoked, we get…
...
JKL> (g 42)
"In closure"
Eval error: Plus - expected a number but got: '"wtf"'
In let*
In <fn (fArg) (let* (n fArg) (+ n "wtf"))>
In REPL
We aren’t seeing a full call stack because the env
objects (as currently implemented) were not in fact designed to record a sequence of function calls. Instead, their purpose is to record the context (hence ‘environment’) in which a function was defined, and to recover this context when the body of the function is evaluated. Specifically, when EVAL
has to evaluate a non-core function (one created by fn*
as opposed to say an arithmetic operation) it:
- Creates a new evalution
env
that consists of the one that was in force when the function was defined (which is the REPL for bothf
ANDg
) and which is stored with the function - Adds to this
env
new symbols created by binding the function’s arguments to the values being passed to the function - Executes the body of the function in the new
env
(via TCO)
So, at this point, I don’t have a full stack trace mechanism. However, the current functionality should be sufficient to support the required by loop
and recur
(the successful implementation of which proved this point.
Additional changes required to support loop
and recur
were as follows:
- Storage of binding symbol names so that these could be rebound
- Storage of body forms so that these can be re-evaluated once a
recur
has set bindings up - A
rebind
function that allows an existing symbol in anenv
to be rebound