evan_tech

Previous Entry Share Next Entry
I'd seen "lazy functional state threads" described before and mostly understood it -- the first step is realizing that they hijacked the meaning of "threads" for something entirely different than what "threads" normally means. But part of the difficulty in understanding some of the cool PL stuff going on is that a lot of it builds on older pieces. So any discussion of lazy functional state threads (hereafter LFST) immediately starts describing them in terms of monads, etc. and it all comes crashing in at once.

This is why their original paper from 1994 (more than 10 years ago!) is interesting: it predates a bunch of stuff and so the exposition is simpler.

The basic problem is that you have two seemingly conflicting goals that you'd like to resolve:
  1. Pure functional code does not have side effects, and (consequently) the evaluation order basically doesn't matter.
  2. Some useful algorithms, while externally appearing referentially transparent (same inputs always produce same outputs) are best internally implemented with mutable, sequenced state (especially arrays -- think of graph algorithms).

So what do you do? Well, it's Haskell, so the approach is always an abstraction with some fancy types. The abstraction is representing the state as an abstract type and providing some sequencing operators (like "first do a, then do b"). Then a "runWithState" function takes all that and runs it in order, finally returning the state-free result of the computation.

But more interesting (and as they write, "the main technical contribution of the paper") is how to keep state from leaking. How do you statically prevent the function from just returning the array? (This would be bad because once you're outside of the state-ordered function you're back in no-evaluation-order land and other stateful snippets that operate on the array could happen in any order.)

Here's where the fancy types come in. They define the function:
runWithState :: ∀a. (∀s. ST s a) -> a
Where ST s a represents the stateful computation, and a is the return type.

First, an aside on the fancy type:
See the forall nested in the argument? That's called rank-2 polymorphism, and is one of the limitations of the Hindley-Milner type inference (also known as "ML-style type inference"): you can only type-infer rank-1 polymorphism (a rule which is often described as "all the foralls go out in the front"). So the polymorphism there must be written explicitly.

Typically (and pretty much the whole point of an ML-style type system) polymorphism is inferred for free. A function that swaps the elements of a pair of any type, like:
swap (x,y) = (y,x)
has its type inferred and typically written down as
swap :: (a,b) -> (b,a)
But what that really means is that it's automatically inferred to have type
swap :: ∀a b. (a,b) -> (b,a)
-- that's what having "variables" in a type means: a universal quantification.

So this runWithState function actually requires the programmer (here, the Haskell standard library author) to explicitly annotate the type. And programmers have enough trouble understanding rank-1 polymorphism in general.

(For more applications of higher-rank polymorphism, see section 2 of Sexy types in action, where LFST gets barely a paragraph's worth of description.)
But what does it mean? It's easy enough to see that the return type ought to be a forall: I want to be able to run many different stateful computations which can return different types, as long as the computation matches the return type -- it's the same a in ST s a and the result a.

And the computation carries a polymorphic s as well, because when I combine two operations I want them to operate over the same sort of state. That is, to combine two operations clearMatrix and identityMatrix, the type of the "sequence" operator correctly requires both to use the same type of internal state (presumably something involving a matrix).

Finally, the forall inside the argument to runWithState means that to be able to run a given stateful computation and get back a non-stateful result, the computation must work for all possible sorts of initial states -- that's what it means to be referentially transparent! If the forall were out in the front, it would be saying that runWithState could be ran for any sort of input state; by putting it in the argument, it's saying that it must be independent of input state and simultaneously that the type of the output cannot depend on it (the quantification of s stops before we get to the output argument).

Polymorphism is weird in that you declare that something is general for all types -- which feels like an expansion of power -- but because you can't know anything about the input type, you're restricted to treating the polymorphic value as a black box: you can't do any operations on it other than pass it around.