This is why their original paper from 1994 (more than 10 years ago!) is interesting: it predates a bunch of stuff and so the exposition is simpler.
The basic problem is that you have two seemingly conflicting goals that you'd like to resolve:
- Pure functional code does not have side effects, and (consequently) the evaluation order basically doesn't matter.
- Some useful algorithms, while externally appearing referentially transparent (same inputs always produce same outputs) are best internally implemented with mutable, sequenced state (especially arrays -- think of graph algorithms).
So what do you do? Well, it's Haskell, so the approach is always an abstraction with some fancy types. The abstraction is representing the state as an abstract type and providing some sequencing operators (like "first do a, then do b"). Then a "runWithState" function takes all that and runs it in order, finally returning the state-free result of the computation.
But more interesting (and as they write, "the main technical contribution of the paper") is how to keep state from leaking. How do you statically prevent the function from just returning the array? (This would be bad because once you're outside of the state-ordered function you're back in no-evaluation-order land and other stateful snippets that operate on the array could happen in any order.)
Here's where the fancy types come in. They define the function:
runWithState :: ∀a. (∀s. ST s a) -> a
Where
ST s a
represents the stateful computation, and a
is the return type.First, an aside on the fancy type:
See the forall nested in the argument? That's called rank-2 polymorphism, and is one of the limitations of the Hindley-Milner type inference (also known as "ML-style type inference"): you can only type-infer rank-1 polymorphism (a rule which is often described as "all the foralls go out in the front"). So the polymorphism there must be written explicitly.But what does it mean? It's easy enough to see that the return type ought to be a forall: I want to be able to run many different stateful computations which can return different types, as long as the computation matches the return type -- it's the same
Typically (and pretty much the whole point of an ML-style type system) polymorphism is inferred for free. A function that swaps the elements of a pair of any type, like:swap (x,y) = (y,x)
has its type inferred and typically written down asswap :: (a,b) -> (b,a)
But what that really means is that it's automatically inferred to have typeswap :: ∀a b. (a,b) -> (b,a)
-- that's what having "variables" in a type means: a universal quantification.
So thisrunWithState
function actually requires the programmer (here, the Haskell standard library author) to explicitly annotate the type. And programmers have enough trouble understanding rank-1 polymorphism in general.
(For more applications of higher-rank polymorphism, see section 2 of Sexy types in action, where LFST gets barely a paragraph's worth of description.)
a
in ST s a
and the result a
.And the computation carries a polymorphic
s
as well, because when I combine two operations I want them to operate over the same sort of state. That is, to combine two operations clearMatrix
and identityMatrix
, the type of the "sequence" operator correctly requires both to use the same type of internal state (presumably something involving a matrix).Finally, the forall inside the argument to
runWithState
means that to be able to run a given stateful computation and get back a non-stateful result, the computation must work for all possible sorts of initial states -- that's what it means to be referentially transparent! If the forall were out in the front, it would be saying that runWithState could be ran for any sort of input state; by putting it in the argument, it's saying that it must be independent of input state and simultaneously that the type of the output cannot depend on it (the quantification of s
stops before we get to the output argument).Polymorphism is weird in that you declare that something is general for all types -- which feels like an expansion of power -- but because you can't know anything about the input type, you're restricted to treating the polymorphic value as a black box: you can't do any operations on it other than pass it around.