So please forgive me for my intro-CS level of discussion, but I was curious about this the other day and enjoyed learning about it.

**the algorithm**

Here's an algorithm (credited to Hoare, not to be confused with

**graydon**) for finding the median of an unsorted array. Actually, it more generally gives you the kth smallest element, and getting the median is just k = length / 2.

- Guess that the kth element is the kth smallest.
- Partition the array into smaller and greater halves than that guess.
- If the smaller half has k-1 elements, the guess was correct.
- Otherwise, the kth smallest element is in one of the two halves, so adjust k and recurse into the appropriate one.

**observations**

The simpler algorithm for this is, of course:

- Sort the array.
- Take the middle element.

*is*quicksort, except that you only recurse into the subdivisions that have the one value you care about. (I guess quicksort was invented by the same fellow, so maybe this isn't a surprise.)

Another way of interpreting this is like a depth-first build of a binary tree: the guess is your current node, the partition picks all the nodes that belong on the left and right branches, and the recursion continues building the tree on the branch.

**implementation**

This (or an equivalent function) is in C++, of course, as the function

`nth_element`

.Here's a Haskell implementation of it using lists. (Yeah, not quite the same, but it carries the idea.)

nth n l|n<ltlen=nth n lt|n>ltlen=nth (n-ltlen-1) gt|otherwise=pivotwhere(pivot,l')=select n l (lt, gt)=partition (<pivot) l' ltlen=length lt

`partition`

is a builtin; `select`

is a function that pulls out the nth element from a list.**verifying**

But how do I know it works? There's a neat library called "QuickCheck" that does randomized testing: you tell it what sorts of inputs your function expects and a way to test whether the output is correct, and it randomly generates inputs to verify proper functioning.

QuickCheck is one of those libraries that seems to work by magic: no code preprocessing or introspection, just (as always) fancy types.

For example, here's how I write the requirement for

`nth`

. I rely on the builtin `sort`

to give me the correct answer.prop_Nth n lThis is just a normal function definition, but you can read the first line as "the property Nth holds for inputs n and l, when ..."=nth n l==sort l!!n

(Here,

`!!`

is the nth-element-from-a-list operator. I assume it has a scary-looking name because it's O(n).)And then you check it with simply:

`quickCheck prop_Nth`

And QuickCheck can figure out (through type inference, again) that this particular property takes an integer and a list, and automatically generates test cases and runs them.

When I first ran this it found all sorts of places where it'd fail that would be easy to forget with a unit test: for example, when the input list is empty, or when n is not in the bounds of the list. So they provide all sorts of fancy higher-level operations that let you modify the random-input-generators or restrict the inputs, or even have it provide breakdowns of the sorts of inputs it gave ("32% three-element lists.") The manual is hokey-looking but quite impressive.

**P.S**

Here's an interesting thread about how laziness affects big-O.