Lambda Interpreter, Part II, Semantics
Posted: October 24, 2012 Filed under: Lambda Calculus, ML Leave a comment
First thing, we need to be able to define identifiers, at least at the top level, for S
, K
, I
, etc. A simple list of string-value pairs will do, we will fill in our default environment later:
fun getenv [] v = NONE | getenv ((v',e)::s) v = if (v = v') then SOME e else getenv s v; (* Make an environment from a list of strings *) fun mkenv [] k = k | mkenv (v::e::s) k = mkenv s ((v,parse e)::k)
We will define two evaluation mechanisms, one will reduce one redex at a time, with the index of the redex passed in to the function, the other will just reduce until a normal form is reached, if there is one.
It’s convenient to use option types to indicate if a subexpression has been reduced, and it’s even more convenient to define some monadic-style helper functions:
fun lift f NONE = NONE | lift f (SOME x) = SOME (f x); fun bind f NONE = NONE | bind f (SOME a) = f a; (* Apply f to a, if no result, apply g to b *) fun try f a g b = case f a of r as SOME _ => r | NONE => g b; fun get (SOME x) = x;
Here’s the single redex reducer: try to reduce each subexpression, keeping track of redex indexes. Return SOME(e')
if a sub evaluation succeeds, else return NONE
with some monadic trickery to handle the mechanics. We avoid expanding global identifiers unless they are in functional position.
fun reduce env n e = let fun aux n (Var _) = NONE | aux 0 (App(Lambda(v,e1),e2)) = SOME (subst v e2 e1) | aux n (App(Lambda(v,e1),e2)) = try (lift (fn e1' => App(Lambda(v,e1'),e2)) o aux (n-1)) e1 (lift (fn e2' => App(Lambda(v,e1),e2')) o aux (n-1)) e2 | aux n (App(e1 as Var v,e2)) = try (bind (fn e1' => aux n (App(e1',e2))) o getenv env) v (lift (fn e2' => App(e1,e2')) o aux n) e2 | aux n (App(e1,e2)) = try (lift (fn e1' => App(e1',e2)) o aux n) e1 (lift (fn e2' => App(e1,e2')) o aux n) e2 | aux n (Lambda(v,e1)) = (lift (fn e1' => Lambda(v,e1')) o aux n) e1 in aux n e end;
That’s all very well for reducing individual redexes, let’s define something that will just let rip on an expression. We’d like to do a proper normal order evaluation, so we’ll use an evaluation stack: go down left branch of expression, reducing beta redexes as we go. When we have got to the end, there are no more top-level redexes, so recursively evaluate the items on the stack, and finally fold back up in into a single expression:
fun eval env e = let fun foldapp(e1::e2::s) = foldapp(App(e1,e2)::s) | foldapp ([e1]) = e1; fun aux (Lambda(v,e1)) (e2::s) = aux (subst v e2 e1) s | aux (Lambda(v,e1)) [] = Lambda(v, eval env e1) | aux (App(e1,e2)) s = aux e1 (e2::s) | aux (e as Var _) [] = e | aux (e as Var v) s = (case getenv env v of SOME e' => aux e' s | _ => foldapp (map (eval env) (e::s))); in aux e [] end;
All we need now is a suitable environment, and we can start reducing.
val stdenv = mkenv ["S", "λxyz.(xz)(yz)", "K", "λxy.x", "I", "λx.x", "Y", "λf.(λx.f(xx))(λx.f(xx))", "M", "λxy.y(xy)", "T", "λxy.x", "F", "λxy.y", "Z", "λn.n(λx.F)T", "0", "λf.λn.n", "N", "λn.λf.λx.f(nfx)", "P", "λnfx.n(λgh.h(gf))(λu.x)(λu.u)", "*", "λmnfx.m(nf)x", "+", "λmnfx.mf(nfx)", "1", "N0", "2", "N1", "3", "N2", "4", "N3", "5", "N4", "6", "N5", "7", "N6", "8", "N7", "9", "N8", "H", "Y(λgn.(Zn)1(*n(g(Pn))))" ] [];
As well as the usual suspects, M
is my favourite combinator. T
and F
help define conditionals, and there is the usual definition of Church numerals. H
, obviously, is the factorial function.
Finally, we define some convenience functions:
fun run e = case reduce stdenv 0 (pp e) of SOME e' => run e' | NONE => e; (* Evaluate first redex *) fun e0 e = pp (get (reduce stdenv 0 e)); (* Evaluate second redex *) fun e1 e = pp (get (reduce stdenv 1 e)); (* Evaluate a symbol *) fun esym s = pp (get (getenv stdenv s)); (* Evaluate to normal form *) fun e s = pp (eval stdenv (parse s));
And now at last we can do some reductions:
First we probably should check that variable capture is handled properly:
- e "(\\abcd.abcd)xyzw"; xyzw val it = App (App (App #,Var #),Var "w") : Exp - e "(\\vxx'x''.vxx'x'')xyzw"; xyzw val it = App (App (App #,Var #),Var "w") : Exp
It’s interesting that in the first reduction of the latter, there is a sort of cascade of primes being added:
- e0 (parse "(\\vxx'x''.vxx'x'')xyzw"); (λx'x''x'''.xx'x''x''')yzw val it = App (App (App #,Var #),Var "w") : Exp
Anyway, that seems to check out so we can move on to something more interesting. Let’s derive the Turing fixpoint combinator from M
and Y
:
- e1(e1(e0(parse"YM"))); (λx.M(xx))(λx.M(xx)) (λxy.y(xxy))(λx.M(xx)) (λxy.y(xxy))(λxy.y(xxy)) val it = App (Lambda ("x",Lambda #),Lambda ("x",Lambda #)) : Exp
Nice. And we can do factorials:
- e "H4"; λfx.f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(fx))))))))))))))))))))))) val it = Lambda ("f",Lambda ("x",App #)) : Exp
The output is a little big to put inline, but the enthusiastic can try:
run (parse "H2");
See [[here]] for output.
Lambda Interpreter, Part I, Syntax.
Posted: October 9, 2012 Filed under: Lambda Calculus, ML Leave a commentHere’s a simple lambda calculus interpreter I wrote a little while ago. It’s in ML, a wonderful language for its polymorphic type inference, pattern matching, and effortless higher order functions.
We start off nice and simple with some abstract syntax:
type Variable = string; datatype Exp = Var of Variable | App of Exp * Exp | Lambda of Variable * Exp;
We could use de Bruijn indices for variables, making variable substitution easier, but I’m a bit of a traditionalist and it’s not too difficult to do it the hard way.
First job is printing expressions, a nice little recursive scanner will do, and we can sort out bracketing, eliding λx.λy.e
to λxy.e
and so on with some nifty pattern matching:
fun pexp (Var x) = x | pexp (e as Lambda _) = "λ" ^ plambda e | pexp (App(e1 as App _,e2)) = pexp e1 ^ pexp1 e2 | pexp (App(e1,e2)) = pexp1 e1 ^ pexp1 e2 and pexp1 (Var x) = x | pexp1 e = "(" ^ pexp e ^ ")" and plambda(Lambda(v,e)) = v ^ plambda e | plambda e = "." ^ pexp e; fun pp e = (print (pexp e ^ "\n"); e)
Next up, variable substitution, a little tedious, but has to be done. First, we need to know if a variable occurs free in an expression. If it does, we need to find a variable that doesn’t, which we do by decorating with primes. Having got ourselves a variable that isn’t free, we can use it to substitute for the one that is, and that’s all there is to it. The code is probably clearer:
fun isfree c (Var c') = c = c' | isfree c (Lambda(c',e)) = if (c = c') then false else isfree c e | isfree c (App(e1,e2)) = if isfree c e1 then true else isfree c e2; fun occurs c (Var c') = c = c' | occurs c (Lambda(c',e)) = if (c = c') then true else occurs c e | occurs c (App(e1,e2)) = if occurs c e1 then true else occurs c e2; (* Add primes to variable until we find one not occurring in e *) fun findnew v e = let val v' = v ^ "'" in if not (occurs v' e) then v' else findnew v' e end; fun subst v e1 (e2 as (Var v')) = if v = v' then e1 else e2 | subst v e1 (App(e2,e3)) = App(subst v e1 e2, subst v e1 e3) | subst v e1 (e2 as Lambda(v',e3)) = if not (isfree v e2) then e2 (* Includes case v = v' *) else if isfree v' e1 then (* Find a variable not in e1 or e3 to use *) let val v'' = findnew v' (App(e1,e3)) in subst v e1 (Lambda(v'', subst v' (Var v'') e3)) end else Lambda(v', subst v e1 e3);
Phew, glad that’s over. Next, we need to lex and parse expressions. Lexing is straightforward, variables are single letters, we allow either \
or λ
for lambda; it seems that we get the UTF-8 bytes for λ
one at a time, so that’s just about our only multi character token, apart from primed identifiers (since we use primes to avoid variable capture in substitutions, we want to be able to read them in as well). Lex input is just a list of characters from an exploded string.
datatype LexItem = LAM | BRA | KET | DOT | VAR of string; fun lex [] t = rev t | lex (#" "::s) t = lex s t | lex (#"\n"::s) t = lex s t | lex (#"\t"::s) t = lex s t | lex (#"\\"::s) t = lex s (LAM::t) | lex (#"("::s) t = lex s (BRA::t) | lex (#")"::s) t = lex s (KET::t) | lex (#"."::s) t = lex s (DOT::t) | lex (#"\206" :: #"\187" ::s) t = lex s (LAM::t) | lex (c::s) t = lexvar s [c] t and lexvar (#"'"::s) v t = lexvar s (#"'"::v) t | lexvar s v t = lex s (VAR (implode(rev v))::t);
Parsing is even more fun. This is a hand-built LR table-driven parser; table driven parsers are good for tricks like semi-intelligent error recovery or doing parse conflict resolution on the fly (useful eg. for languages with redefinable operation precedences like ML). We don’t do anything like that though, we don’t even explicitly detect errors, and instead rely on ML’s pattern matching – if the input is ungrammatical, we get an inexhaustive match error:
fun parse s = let (* Items appearing on the parse stack *) datatype ParseItem = B | E of Exp | V of Variable; fun aux [E e] [] = e | aux (E e1::E e2::s) t = aux (E(App(e2,e1))::s) t | aux s (LAM::VAR c::DOT::t) = aux (V c::s) t | aux s (LAM::VAR c::VAR c'::t) = aux (V c::s) (LAM::VAR c'::t) | aux s (BRA::t) = aux (B::s) t | aux ((a as E _):: B :: s) (KET::t) = aux (a::s) t | aux s (VAR c::t) = aux(E(Var c)::s) t | aux (E e::V v::s) t = aux (E(Lambda(v,e))::s) t; in aux [] (lex (explode s) []) end
Well, that’s it for the moment. Next installment – evaluating expressions.
Templated bit twiddling
Posted: October 6, 2012 Filed under: Bit twiddling Leave a commentSometimes it’s nice to combine C++ template metaprogramming with some bit twiddling, here’s a classic parallelized bit counting algorithm with all the magic numbers constructed as part of template instantiation:
#include <stdio.h> #include <stdlib.h> template <typename T, int k, T m> struct A { static T f(T n) { static const int k1 = k/2; n = A<T, k1, m^(m<<k1)>::f(n); return ((n>>k)&m) + (n&m); } }; template <typename T, T m> struct A<T, 0, m> { static T f(T n) { return n; } }; template<typename T> static inline int bitcount (T n) { static const int k = 4 * sizeof(n); return A<T, k, (T(1)<<k)-1>::f(n); } int main(int argc, char *argv[]) { unsigned long n = strtoul(argv[1], NULL, 10); printf("bitcount(%lu) = %d\n", n, bitcount(n)); }
We could extract the magic number construction out as a separate template, and there are some extra optimizations that this code doesn’t do, for example, the level 1 code can be simplified to:
i = i - ((i >> 1) & 0x55555555);
rather than what we get from the above:
i = ((i >> 1)&0x55555555) + (i&0x55555555);
Adding an appropriate partial template instantiation for A<T,1,m>
is left as an exercise.
In fact, there are lots of shortcuts that can be made here, we aren’t really being competitive with Bit Twiddling Hacks:
unsigned int v; // count bits set in this (32-bit value) unsigned int c; // store the total here static const int S[] = {1, 2, 4, 8, 16}; // Magic Binary Numbers static const int B[] = {0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF}; c = v - ((v >> 1) & B[0]); c = ((c >> S[1]) & B[1]) + (c & B[1]); c = ((c >> S[2]) + c) & B[2]; c = ((c >> S[3]) + c) & B[3]; c = ((c >> S[4]) + c) & B[4];
which would need 3 template specializations and we are starting to get a bit complicated…
In the same style then, here’s a bit reverse function, with a separate magic number calculating template:
#include <stdio.h> #include <stdlib.h> template <typename T, int k> struct Magic { static const T m = ~T(0)/(T(1)+(T(1)<<k)); }; template <typename T, int k> struct RevAux { static inline T f(T n) { T n1 = RevAux<T,k/2>::f(n); return ((n1>>k) & Magic<T,k>::m) | ((n1 & Magic<T,k>::m)<<k); } }; template <typename T> struct RevAux<T,0> { static inline T f(T n) { return n; } }; template <typename T> T rev(T n) { return RevAux<T,4*sizeof(T)>::f(n); } int main(int argc, char *argv[]) { unsigned long n = strtoul(argv[1],NULL,16); printf("%lx %lx\n", n, rev(n)); }
Now, how do we check that the template parameter T is unsigned (otherwise the right shifts are probably going to go wrong)…
A simple optimization is to replace the top level function with:
template <typename T> static inline T rev(T n) { static const int k = 4*sizeof(T); n = RevAux<T,k/2>::f(n); return (n>>k)|(n<<k); }
but GCC seems to have worked that out for itself (compiling gcc 4.4.3 , -O3 -S):
.globl _Z7reversej pushl %ebp movl %esp, %ebp movl 8(%ebp), %eax popl %ebp movl %eax, %edx andl $1431655765, %edx shrl %eax addl %edx, %edx andl $1431655765, %eax orl %eax, %edx movl %edx, %eax andl $858993459, %eax shrl $2, %edx andl $858993459, %edx sall $2, %eax orl %edx, %eax movl %eax, %edx andl $252645135, %edx shrl $4, %eax andl $252645135, %eax sall $4, %edx orl %eax, %edx movl %edx, %eax andl $16711935, %eax shrl $8, %edx sall $8, %eax andl $16711935, %edx orl %edx, %eax rorl $16, %eax ret
This sort of thing isn’t going to work very well if the compiler isn’t up to scratch, but GCC seems to have done it right here.
There are slightly faster ways to reverse bits using ternary swap (swap top and bottom thirds, recurse), but they need to be more ad hoc unless you have a ternary computer with word length a power of 3, so less amenable to this sort of generic treatment (or at least, it becomes a lot harder).
An Interesting Combinator
Posted: October 5, 2012 Filed under: Lambda Calculus Leave a commentA fixpoint operator F
is such that:
ie. F = λf.f(Ff)
ie. F
is a fixpoint of λF.λf.f(Ff)
ie. fixpoints are fixpoints of M = λx.λy.y(xy)
Church’s Y
operator is a fixpoint operator:
Y = λf.(λx.f(xx))(λx.f(xx))
[Yf = (λx.f(xx))(λx.f(xx)) [beta]
= f((λx.f(xx))(λx.f(xx))) [beta]
= f(Yf) [subst of equals]]
Let’s apply Y
to M
and call the result T:
T = YM = (λx.M(xx))(λx.M(xx))
= (λx.λy.y(xxy))(λx.λy.y(xxy)) [since λx.M(xx) = λx.λy.y(xxy)]
T
is the Turing fixpoint combinator. Isn’t that strange? No doubt there is some deep reason behind this, but what it might be I have no idea. Perhaps it’s related to the fact that Yf
needs substitution of equals to show equivalence with f(Yf)
, whereas T(f)
directly reduces to f(Tf)
:
Tf = (λx.λy.y(xxy))Bf [def]
= f(BBf) [beta]
= f(Tf) [def]
A Familiar Series?
Posted: October 4, 2012 Filed under: Floating Point Leave a comment
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 18 20 22 24 26 28 30
32 36 40 44 48 52 56 60
64 72 80 88 96 104 112 120
128 144 160 176 192 208 224 240
256 288 320 352 384 416 448 480
512 576 640 704 768 832 896 960
1024 1152 1280 1408 1536 1664 1792 1920
2048 2304 2560 2816 3072 3328 3584 3840
4096 4608 5120 5632 6144 6656 7168 7680
8192 9216 10240 11264 12288 13312 14336 15360
16384 18432 20480 22528 24576 26624 28672 30720
32768 36864 40960 45056 49152 53248 57344 61440
65536 73728 81920 90112 98304 106496 114688 122880
131072 147456 163840 180224 196608 212992 229376 245760
262144 294912 327680 360448 393216 425984 458752 491520
524288 589824 655360 720896 786432 851968 917504 983040
1048576 1179648 1310720 1441792 1572864 1703936 1835008 1966080
2097152 2359296 2621440 2883584 3145728 3407872 3670016 3932160
4194304 4718592 5242880 5767168 6291456 6815744 7340032 7864320
In fact, they are floating point numbers, or rather, integers with the same distribution as floating point numbers, and just need some appropriate scaling, for example by dividing by 8388608, to give a nice set of numbers in the interval [0,1).
Notice that for all but the first row, every entry in a row has the same number of digits in its binary representation – they are normalized, with the first row corresponding to the denormalized numbers of IEEE-754. Without the denormalized numbers, there would be a great gap around zero.
Also, the numbers in each row are the same distance apart, with the gap at row n+1 being twice the gap at row n. Again, the first row is the exception, the gap here is the same as the second row – so the first 16 numbers form a uniformly spaced sequence from 0 – there isn’t anything very peculiar about the denormalized numbers, they just represent a continuation of the first row of normalized numbers to fill the gap down to zero.
We can see how a scaled comparison of floats might work too – within each row, two elements are close if they are within n places of each other, or equivalently, if they are within some constant multiple of the row gap. For elements in different rows, we can either just carry on counting the number of intervening numbers, or continue using the row gap to define a neighbourhood of each number. For example, if “close” means “twice the row gap”, then 22 is close to 18, 20, 24 and 26; 16 is close to 12, 13, 14. 15, 18 and 20; and 15 is close to just 13, 14 and 16. This notion of closeness is discussed by Knuth in TAOCP.
We don’t have to insist on base 2 of course, here’s a similar sequence based on powers of 3, with 10 elements in each segment. This gives us 5 denormalized values:
0 1 2 3 4
5 6 7 8 9 10 11 12 13 14
15 18 21 24 27 30 33 36 39 42
45 54 63 72 81 90 99 108 117 126
135 162 189 216 243 270 297 324 351 378
405 486 567 648 729 810 891 972 1053 1134
1215 1458 1701 1944 2187 2430 2673 2916 3159 3402
3645 4374 5103 5832 6561 7290 8019 8748 9477 10206
10935 13122 15309 17496 19683 21870 24057 26244 28431 30618
32805 39366 45927 52488 59049 65610 72171 78732 85293 91854
98415 118098 137781 157464 177147 196830 216513 236196 255879 275562
295245 354294 413343 472392 531441 590490 649539 708588 767637 826686
885735 1062882 1240029 1417176 1594323 1771470 1948617 2125764 2302911 2480058
Once again, each row, except the first, every number has the same number of digits in its base 3 representation.