# lazy big-O

Concatenating two linked lists of length m and n in your standard functional languages is a common operation understood to be O(m).

This gets weird in a lazy language. Here's an implementation of a standard concat function, for reference:
```concat (a:as) bs = a : concat as bs
concat []     bs = bs```

If I write code like:
`x = concat l1 l2`
I get back a lazy value like:
`{a0 : concat a1.. bs}`
(I'll use curlies around code to represent a suspended computation, and subscripts to index into the lists). This is an O(1) operation, at least wrt to the list constants m and n.

If I later try to get the first element of `x`, it's forced to evaluate a single step and produce the first element of the concatenation. This is again O(1), leaving `x` as:
`a0 : {a1 : concat a2.. bs}`

By that logic, concatenating the lists is O(1), but you haven't shifted the time elsewhere -- you've added O(k) to accessing the kth element of `x`, but that was O(k) before concatenation was involved. It's like the concatenation happened for free! (Except it's not really free, of course; the concat created an an extra O(m) new conses.)

In terms of big-O algorithm time though, the effect is constant and dropped. This is one of the reasons laziness can (almost paradoxically) lead to more efficient code: you only pay for the bits you actually use. If the bit you use is significantly smaller than the total list -- small enough to offset the (large) additional cost of introducing and evaluating all these thunks -- then you've saved time.

But you still need to take care. If you have a list of lists to concatenate, then you can make it inefficient if you concatenate the "wrong way":
`concat (concat (concat L0 L1) L2) L3`
There, getting to the elements in the leftmost list goes through O(length(L)) layerings of `concat` before it reaches an element.

Instead, concatenating "on the other side":
`concat L0 (concat L1 (concat L2 ...))`

This difference between these two recursion schemas (which have the same result for associative operators) is the difference between `foldl` and `foldr`, which are demonstrated in a amusingly interactive way via foldl.com and foldr.com. In Haskell, the list-of-lists concatenation function is itself called `concat` (the concat function I defined at the beginning of the post is just `++`), and it's defined in the Haskell prelude as `foldr (++) []`.

PS: Rubyists, you may recognize `foldl` as `Enumerable#inject`; I don't think Ruby has a `foldr`. My impression is that Ruby's gratuitously-different names for functional operators (like `select` for `filter`) date back to Smalltalk.
Tags:
• #### blog moved

As described elsewhere, I've quit LiveJournal. If you're interested in my continuing posts, you should look at one of these (each contains feed…

• #### dremel

They published a paper on Dremel, my favorite previously-unpublished tool from the Google toolchest. Greg Linden discusses it: "[...] it is capable…

• #### treemaps

I finally wrote up my recent adventures in treemapping, complete with nifty clickable visualizations.

• #### Error

default userpic
When you submit the form an invisible reCAPTCHA check will be performed.