evan_tech

Previous Entry Share Next Entry
There's a funny document titled "evolution of a Haskell programmer", where it shows a simple problem and satirizes how it would be solved by Haskell programmers of varying skill. (The better they get, the more involved the implementation becomes.) I just had a moment like that myself, and it was totally awesome and I'd like to share it with you.

I fear everyone's eyes glaze over whenever I mention Haskell, but just in case you're still with me, I'm gonna explain things relatively simply. The main thing you need to know is that functions are applied without parenthesis: "f foo bar" is like "f(foo, bar)" in other languages.

Here's the toy problem (a simplification of something I was actually working on).
You have two functions:
  1. fromString: takes a string and gives you back either the integer that the string was representing or nothing (if the string didn't contain digits). In Haskell this function of type String -> Maybe Int, where the "Maybe" means that it can return either "Nothing" or "Just 3" (or any other integer). You can think of it like returning either a value or NULL in C.
  2. double: takes an integer and doubles it. (This is not the real program I was working on, obviously.
I want a function doubledFromString, that either gives me back Nothing or Just twice the value of whatever the string held.

So, the evolution:
  1. The first way to do it is just like you'd do it in ML, and pretty much like you'd do it in C, Perl, Python*, etc.:
    doubledFromString s =
      let num = fromString s in
      case num of
        Nothing -> Nothing
        Just x -> Just (double x)
    First you decode the string, then depending on whether you got a number back you either produce Nothing or you double it.
  2. The first modification is doing it functionally. There's a function "maybe" (note the small m) that takes three arguments: a Maybe, a value to return if the Maybe holds Nothing, and a function to apply to the value in the Just if it's a Just. The arguments are in a different order, so it'd best be illustrated by examples:
    maybe 0 double (Just 3) produces 6.
    maybe 0 double Nothing produces 0 (the first argument is the "fallback").

    So I can use maybe to reduce this down to a single line:
    doubledFromString s = maybe Nothing (Just . double) (fromString s)
    That's saying: if fromString produces Nothing, produce Nothing. Otherwise, apply double to the result, and then Just to that. (The period is function composition. Remember math? The thing that programming came from?)
    I don't like this solution 'cause it's sorta hard to read.
  3. Maybe is a monad. Monads are vaguely ways of expressing sequences of computation. The Maybe monad which has the basic rule that if any step along the way produces Nothing, then the whole thing is Nothing. This is a common pattern in programming... it reminds me of my Windows programming days, where you tend to write code like:
    ret = NULL;
    hFoo = GetFoo(...);
    if (hFoo) {
      hBar = GetBar(hFoo, ...);
      if (hBar) {
        hBaz = GetBaz(hBar, ...);
        if (hBaz) {
          ret = UseBaz(hBaz);
          Release(hBaz);
        }
        Release(hBar);
      }
      Release(hFoo);
    }
    return ret;
    Yikes. I've seen that pattern so many times...

    So here's the function using the Maybe monad. Haskell has special syntax for monads, but since so many things are monads you get to use it everywhere.
    doubledFromString s = do
      x <- fromString s
      return (double x)
    The "do" keyword introduces monadic code. Having this use the Maybe monad means that if fromString returns Nothing, the whole thing stops there.
    I really like this solution; it's easy to follow, and adding more intermediate steps (like the GetBar above) would just be an extra line in the middle -- no more nesting or if statements. To a C programmer's eyes it might look like the programmer forgot about error checking but the type system wouldn't compile it if the code didn't handle the Nothing possibilities properly.
  4. But it's still verbose. This "do" stuff is just syntactic sugar for the more basic operation of sequencing, and so the above can also be expressed as:
    doubledFromString s = fromString s >>= return . double
    where the crazy-looking >>= operator means "take the output from the left and put it into the right". (Again, since it's in the Maybe monad, if fromString returns Nothing the whole thing stops there and returns Nothing.)
    This one is ok, but it looks like solution #2 in that it's a little hard to follow.
  5. But the better takeaway is this. The operation you're trying to do can also be looked at as: I want do run this "double" function inside the Maybe monad, effectively doubling the value if there is one or skipping it if there isn't. So there's a function "liftM" that takes a normal function (double just maps from integers to integers) and "lifts" it to work in a monad. So we can instead write it as:
    doubledFromString s = liftM double (fromString s)
    That is, I "lift" double to work inside the Maybe and apply it to the result of fromString.
    This solution is great.
  6. You can actually do one better:
    doubledFromString = liftM double . fromString
    That says: doubledFromString is the function which first applies fromString, then applies double within the monad.
    This solution is awesome.
Where am I in this evolution? I started with #3 and worked my way forward; I added the earlier steps to show where I've come from. I'm aiming for #6. :)

* Dutiful Python programmers would probably use exceptions, which are definitely nicer, but then you have the problem of knowing which functions throw and which don't and neither the type system nor the function definitions tell you that without you checking through all functions transitively called...