Monday, December 5th, 2011

Today I’m going to visit two topics that I’ve not covered yet: lazy evaluation and types. Personally, the type system is the hardest thing to get my head around and I hope to write a lot more on the subject.

Lazy evaluation allows for the delay of evaluation until a required time. In Shen, lazy evalation is controlled with two functions: freeze and thaw. As the names suggest freeze delays evaluation and thaw evaluates a frozen expression.

Let’s jump into a REPL and try these out:

1 2 3 4 5 | (0-) (freeze 1) #<FUNCTION \:LAMBDA NIL 1> (1-) (thaw (freeze 1)) 1 |

Note that `(freeze ...)`

returns a function which is evaluated with thaw. This is almost the behaviour we get with a lambda (/.) except the returned function takes no parameters.

What I want to try here is to implement Clojure-like lazy-sequences in Shen. I’m not fully versed in how Clojure lazy-sequences work but I should be able to at least achieve some approximation of their behaviour.

The first function I’m going to implement is a lazy iterate. Iterate takes a function `F`

and a value `X`

and returns a lazy sequence of F applied to the previous value in the sequence: `[X (F X) (F (F X)) (F (F (F X))) ...]`

. The definition of iterate is as follows:

1 2 | (define iterate F X -> (freeze (cons X (iterate F (F X))))) |

The `(freeze (cons ...))`

pattern will be common to these lazy sequence generators so let’s clarify the notation with a macro:

1 2 | (defmacro lazy-cons-macro [lazy-cons X Y] -> [freeze [cons X Y]]) |

The definition of iterate then becomes:

1 2 | (define iterate F X -> (lazy-cons X (iterate F (F X)))) |

This version operates differently to Clojure’s iterate in that if I evaluate `(iterate sq 2)`

in Clojure the REPL will hang as it tries to evaluate and print an infinite list; in the Shen version iterate just returns a function.

1 2 | (5-) (iterate (+ 1) 0) #<COMPILED-FUNCTION iterate-1> |

Now that we have a working iterate it would be useful to be able to take values from our infinite lists. We define take to do just this:

1 2 3 4 5 6 7 | (define take-rec 0 X Acc -> (reverse Acc) N X Acc -> (let TX (thaw X) (take-rec (- N 1) (tail TX) [(head TX) | Acc]))) (define take N X -> (take-rec N X [])) |

1 2 | (8-) (take 10 (iterate (+ 1) 0)) [0 1 2 3 4 5 6 7 8 9] |

Cool...but we can do better than that. Turning on the type checker will yield type errors as we’ve not supplied any type declarations. As lazy lists are something you might want to use in typed code it is sensible to supply types.

Let’s introduce a recursive type `lazy-seq`

. `lazy-seq`

will be polymorphic so that we have, for example, `(lazy-seq number)`

, `(lazy-seq string)`

. The definition of `lazy-seq`

is:

1 2 3 4 5 6 7 8 9 | (datatype lazy-seq-type X : (lazy (list A)); ___________________ X : (lazy-seq A); X : A; Y : (lazy-seq A); _______________________ (freeze (cons X Y)) : (lazy-seq A);) |

1 2 | (11+) (freeze (cons 1 (freeze (cons 2 ())))) #<FUNCTION :LAMBDA NIL (CONS 1 (freeze (CONS 2 NIL)))> : (lazy-seq number) |

Now we can define a typed version of iterate:

1 2 3 | (define iterate { (A --> A) --> A --> (lazy-seq A) } F X -> (lazy-cons X (iterate F (F X)))) |

Similarly, we can redefine `take-rec`

:

1 2 3 4 5 | (define take-rec { number --> (lazy-seq A) --> (list A) --> (list A) } 0 X Acc -> (reverse Acc) N X Acc -> (let TX (thaw X) (take-rec (- N 1) (tail TX) [(head TX) | Acc]))) |

...but we have a problem. This doesn’t type check because head and tail are typed to work only with `(list A)`

types and thaw will only accept `(lazy A)`

types. We need to promise the type checker that thaw can be applied to a `lazy-seq`

and that head and tail can be applied to a thawed `lazy-seq`

. `lazy-seq`

can be redefined as follows:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | (datatype lazy-seq-type X : (lazy (list A)); ___________________ X : (lazy-seq A); X : A; Y : (lazy-seq A); _______________________ (freeze (cons X Y)) : (lazy-seq A); X : (lazy-seq A); ________________ (thaw X) : (thawed-seq A); X : (thawed-seq A); __________________ (head X) : A; X : (thawed-seq A); __________________ (tail X) : (lazy-seq A); X : (lazy-seq A); ________________ (= (thaw X) []) : boolean;) |

1 2 | (15+) (thaw (iterate (+ 1) 0)) [0 | #<COMPILED-FUNCTION iterate-1>] : (thawed-seq number) |

The important thing to realise is that Shen doesn’t just accept types for data structures but that any expression can be typed. The previous definition of `take-rec`

now type checks and we define take as:

1 2 3 | (define take { number --> (lazy-seq A) --> (list A) } N X -> (take-rec N X [])) |

1 2 | (19+) (take 10 (iterate (+ 1) 0)) [0 1 2 3 4 5 6 7 8 9] : (list number) |

Great! Let’s try a lazy map:

1 2 3 | (define lmap { (A --> B) --> (lazy-seq A) --> (lazy-seq B) } F X -> (lazy-cons (F (head (thaw X))) (lmap F (tail (thaw X))))) |

1 2 | (21+) (take 10 (lmap (* 2) (iterate (+ 1) 0))) [0 2 4 6 8 10 12 14 16 18] : (list number) |

The problem with `take`

and `lmap`

so far is that they only work with lazy sequences. What we’d really like is to apply take and lmap to lists as well as lazy sequences. To do this we’ll introduce a new datatype seq that encompasses both `list`

and `lazy-seq`

types.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | (datatype seq-type X : (list A); ___________ X : (seq A); X : (seq A); ____________ (head X) : A; X : (seq A); ___________________ (tail X) : (list A); X : (lazy-seq A); ___________ X : (seq A); X : (seq A); ___________________ (head (thaw X)) : A; X : (seq A); _______________________________ (tail (thaw X)) : (lazy-seq A);) |

Let’s use this definition to define `seq`

equivalents to `head`

and `tail`

; I’ll call them `first`

and `rest`

. They are defined as follows:

1 2 3 4 5 6 7 8 9 | (define first { (seq A) --> A } X -> (head X) where (cons? X) X -> (head (thaw X))) (define rest { (seq A) --> (seq A) } X -> (tail X) where (cons? X) X -> (tail (thaw X))) |

1 2 3 4 5 | (25+) (first (iterate (+ 1) 0)) 0 : number (26+) (first [0 1 2 3]) 0 : number |

There is an issue with the above in that `(head X)`

and `(head (thaw X))`

will type check for either a `list`

or a `lazy-seq`

. This will yield a run-time error if we try, for example, `(head (iterate (+ 1) 0))`

or `(head (thaw [1 2 3]))`

. Here’s a more type-secure definition of `seq`

using verified objects:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | (datatype seq-type ____________________________________ (cons? X) : verified >> X : (list A); X : (list A); ___________ X : (seq A); ____________________________________ (notcons? X) : verified >> X : (lazy-seq A); X : (lazy-seq A); ___________ X : (seq A);) (define notcons? { A --> boolean } X -> (not (cons? X))) |

I’ve removed the `(head X)`

an `(head (thaw X))`

types for a `(seq A)`

and replaced them with a verified type for lists and lazy sequences. This allows the following redefined `first`

and `rest`

to type check:

1 2 3 4 5 6 7 8 9 | (define first { (seq A) --> A } X -> (head X) where (cons? X) X -> (head (thaw X)) where (notcons? X)) (define rest { (seq A) --> (seq A) } X -> (tail X) where (cons? X) X -> (tail (thaw X)) where (notcons? X)) |

We can now go back and redefine `take-rec`

, `take`

and `lmap`

to work on `seq`

:

1 2 3 4 5 6 7 8 9 10 11 12 13 | (define take-rec { number --> (seq A) --> (list A) --> (list A) } 0 X Acc -> (reverse Acc) N [] Acc -> (reverse Acc) N X Acc -> (take-rec (- N 1) (rest X) [(first X) | Acc])) (define take { number --> (seq A) --> (list A) } N X -> (take-rec N X [])) (define lmap { (A --> B) --> (seq A) --> (lazy-seq B) } F X -> (lazy-cons (F (first X)) (lmap F (rest X)))) |

1 2 3 4 5 | (34+) (take 2 [1 2 3]) [1 2] : (list number) (35+) (take 2 (iterate (+ 1) 1)) [1 2] : (list number) |

There is a remaining issue where attempting to take N elements of a lazy-mapped list where N is greater than the length of the list will yield an error. We redefine `rest`

to check if we have more elements or not and offer a new definition of `lmap`

:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | (define rest-lazy-seq { (lazy-seq A) --> (seq A) } X -> [] where (= (thaw (tail (thaw X))) []) X -> (tail (thaw X))) (define rest { (seq A) --> (seq A) } [] -> [] X -> (tail X) where (cons? X) X -> (rest-lazy-seq X) where (notcons? X)) (define lmap { (A --> B) --> (seq A) --> (lazy-seq B) } F [] -> (freeze []) F X -> (lazy-cons (F (first X)) (lmap F (rest X)))) |

1 2 3 4 5 | (39+) (take 10 [1 2 3]) [1 2 3] : (list number) (40+) (take 10 (lmap (* 2) [1 2 3])) [2 4 6] : (list number) |

Here’s the final version of our lazy list code:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 | (defmacro lazy-cons-macro [lazy-cons X Y] -> [freeze [cons X Y]]) (datatype lazy-seq-type X : (lazy (list A)); ___________________ X : (lazy-seq A); X : A; Y : (lazy-seq A); _______________________ (freeze (cons X Y)) : (lazy-seq A); X : (lazy-seq A); ________________ (thaw X) : (thawed-seq A); X : (thawed-seq A); __________________ (head X) : A; X : (thawed-seq A); __________________ (tail X) : (lazy-seq A); X : (lazy-seq A); ________________ (= (thaw X) []) : boolean;) (datatype seq-type ____________________________________ (cons? X) : verified >> X : (list A); X : (list A); ___________ X : (seq A); ____________________________________ (notcons? X) : verified >> X : (lazy-seq A); X : (lazy-seq A); ___________ X : (seq A);) (define notcons? { A --> boolean } X -> (not (cons? X))) (define first { (seq A) --> A } X -> (head X) where (cons? X) X -> (head (thaw X)) where (notcons? X)) (define rest-lazy-seq { (lazy-seq A) --> (seq A) } X -> [] where (= (thaw (tail (thaw X))) []) X -> (tail (thaw X))) (define rest { (seq A) --> (seq A) } [] -> [] X -> (tail X) where (cons? X) X -> (rest-lazy-seq X) where (notcons? X)) (define take-rec { number --> (seq A) --> (list A) --> (list A) } 0 X Acc -> (reverse Acc) N [] Acc -> (reverse Acc) N X Acc -> (take-rec (- N 1) (rest X) [(first X) | Acc])) (define take { number --> (seq A) --> (list A) } N X -> (take-rec N X [])) (define lmap { (A --> B) --> (seq A) --> (lazy-seq B) } F [] -> (freeze []) F X -> (lazy-cons (F (first X)) (lmap F (rest X)))) |

There we have it, a basic implementation of typed lazy lists with a Clojure-like interface. I’m beginning to get a better feel for the type system and I hope this post will be of use to others. As always, let me know what you think in the comments below.

*Update: You can find the code for this post on my github page which includes changes suggested by Mark Thom.*