# evan_tech

06:29 pm, 1 Feb 09

### zippers

Zippers: there are plenty of docs, so here are some random notes instead. (The original paper is in ML and quite readable.) Zippers are a fine example of the sort of progressive generalization that people play with in Haskell; my favorite blog at the moment is filled with interesting math along those lines.

I think the first zipper I encountered, before I knew the term, was in reading a Brainfuck implementation written in O'Caml. Brainfuck's model is basically a Turing machine, where you have a focal point on a tape and you can move around on the tape and do operations based on the focal point.

The question is: how do you represent this tape data structure with immutable data structures? If you use a list and an offset, examining the focal point is O(n) (to walk down the list to the offset) and modifications are O(n) (to rebuild the list). The solution, which seemed novel enough to me at the time that its obviousness now doesn't dissuade me from describing it for you, is to represent the tape as three parts: cells to the left of you, your current cell, and cells to the right. The trick is that on both sides, the nearest cell is at the front of the list; in some sense the left cell list is backwards.

Now stepping by one cell is O(1): you just pop the top of the relevant side list off and stuff your previous focal point as the new top on the other side. What I didn't fully appreciate at the time is that this is also gentle with memory. You're only popping/adding from the front of a list, so the remainder of the cell lists are constant and shared. Each step is destructuring one cons cell and creating one new one.
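To make that concrete, here's a minimal sketch of the tape zipper in Python (the implementation I read was OCaml; the function names here are my own invention). Nested tuples stand in for cons cells, so the sharing is real: a step never copies either side list, it just peels one cell off and builds one.

```python
# Tape zipper: (left, focus, right). The side lists are cons lists built
# from nested (head, tail) tuples, with None as the empty list, and the
# nearest cell at the front -- so the left list is stored "backwards".

EMPTY = None

def cons(head, tail):
    return (head, tail)

def move_right(tape):
    left, focus, right = tape
    # A blank tape extends with zeroed cells on demand.
    head, rest = right if right is not None else (0, EMPTY)
    return (cons(focus, left), head, rest)

def move_left(tape):
    left, focus, right = tape
    head, rest = left if left is not None else (0, EMPTY)
    return (rest, head, cons(focus, right))

def write(tape, value):
    left, _, right = tape
    return (left, value, right)

# Each step destructures one cons cell and creates one new one; everything
# deeper in either side list is shared untouched between old and new tapes.
tape = (EMPTY, 0, EMPTY)
tape = write(tape, 7)
tape = move_right(tape)   # the 7 is now the nearest cell on the left
tape = move_left(tape)    # and we're back on it
```

Examining or writing the focal point is O(1) by construction, since the focus is stored directly rather than reached through an offset.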

So whatever, you might say. You have to jump through hoops to get functional data structures and it's a waste of time. Here's where it hopefully gets interesting or at least more complicated.

This idea, called a zipper, can be generalized: your cursor is a pair of the current point along with context representing the remainder of the data structure arranged such that nearby movements are small steps. (In the Brainfuck example the context is the cell lists.)

Most examples discuss movement on a tree. Your current point is an unmodified subtree, and your context is a pointer to the subtree's parent along with the rest of the parent's children (the children other than your current point). This otherwise kind of useless Wikibook article has a nice picture about halfway down visualizing it, where your pointer kinda "grabs" the tree and "lifts it up" so it hangs around that point, which is a nice structural metaphor. In terms of memory, it's sorta like the list zipper in that most of the tree is left alone and can be shared during movement; all the zipper added is a reversed path back to the root.

What struck me recently is that with a tree zipper you've stepped around the problem of parent pointers in a tree while still allowing movement in any direction. The zipper here is effectively just building a stack of parent pointers as you walk down the tree and popping the stack to walk up, which is the sort of trick you've likely used (or at least could have invented) in an imperative programming context to conserve memory while allowing arbitrary tree navigation.
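The breadcrumb-stack view can be sketched the same way (again my own toy names, using a binary tree where a node is just a (left, right) pair and anything else is a leaf). Each crumb records which way we went plus the sibling subtree we left behind, which is exactly enough to rebuild the parent on the way back up.

```python
# Binary-tree zipper: the cursor is (focus, crumbs), where crumbs is a
# stack of breadcrumbs. A crumb ('L', sib) means we descended left past
# a node whose right child was sib; ('R', sib) is the mirror case.

def down_left(focus, crumbs):
    left, right = focus
    return left, crumbs + [('L', right)]

def down_right(focus, crumbs):
    left, right = focus
    return right, crumbs + [('R', left)]

def up(focus, crumbs):
    direction, sibling = crumbs[-1]
    # Rebuild only the parent node; the sibling subtree is shared as-is.
    rebuilt = (focus, sibling) if direction == 'L' else (sibling, focus)
    return rebuilt, crumbs[:-1]

tree = ((1, 2), (3, 4))
focus, crumbs = down_left(tree, [])        # focus is (1, 2)
focus, crumbs = down_right(focus, crumbs)  # focus is 2
focus = 99                                 # edit at the cursor
while crumbs:                              # zip back up to the root
    focus, crumbs = up(focus, crumbs)
```

Only the spine from the root to the cursor ever gets rebuilt; every subtree hanging off that spine is shared between the old tree and the edited one.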

This is again all nice enough -- two related concepts that do similar tricks with data structures -- except it can be generalized even further. A zipper can be defined for any data structure systematically by computing the derivative (in the calculus sense, kinda) of its data type! What's the use? Even the guy who discovered it isn't sure: "While this connection seems unlikely to be a mere coincidence, it is perhaps surprising to find a use for the laws of the infinitesimal calculus in such a discrete setting." More research is warranted, I guess.
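The derivative connection can at least be sanity-checked against the tape zipper above. Treating types informally as formal power series (a sketch, not a proof):

```latex
% The list type satisfies  L(A) = 1 + A * L(A),  so formally
\[
  L(A) = \frac{1}{1 - A}
\]
% Differentiating with respect to the element type A:
\[
  \frac{d}{dA}\,\frac{1}{1-A} \;=\; \frac{1}{(1-A)^2} \;=\; L(A) \cdot L(A)
\]
```

A one-hole context in a list is a pair of lists, which is precisely the left and right cell lists of the tape zipper: the derivative of the data type is the type of its zipper contexts.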

•  Have you read Purely Functional Data Structures? I skimmed it back at UNC and thought it was interesting, but never really read it. It could be fun to work through it in a reading group sort of format.
•  This seems to be interestingly similar to the standard Emacs representation of a buffer (two pieces and the gap at point).
•  You beat me to it! Except I was going to preface it with "long ago when computers were slow and you couldn't afford to copy thousands of bytes on every user keystroke...", but maybe that implementation still persists today in Emacs. Also, the Scala book I'm reading now explains how Scala uses the same linked-list technique in its implementation of a ListBuffer class with a constant-time append method.
•  I think it persists! At least, the paper I read recently on the nitty-gritty of implementing an emacs uses it.
•  One of my favorite papers is Derivatives of Regular Expressions. Not really recent but lots of stuff has been spawned by it.