evan_tech

04:13 pm, 17 Aug 06

data types are not objects

One question I can remember students asking, when faced with ML, is something like: "If the language doesn't have objects, how do you provide something like a reusable binary tree?" Without objects, that is, how do you package things into meaningful components? This is a good example of how knowing only one way of looking at the world frames your perception of it, and I like to think that it's one of the reasons forcing college kids through a programming languages course is worthwhile.*

Talking about what is and isn't OOP is too slippery to be able to say much that's meaningful, but hopefully this one lesson is worth keeping in mind: abstract data types do not require objects.

In C, for example, there is an abstract data type that represents files. (By abstract data type (ADT) I mean that when you have a FILE* or file descriptor the only useful operations you can do with it (open, read, write) are those that the author of the file libraries provided for you.) And I don't think anyone would argue that C is object-oriented.

You might respond to this by saying, "well, that's because the unix file interface is object-oriented," but by that reasoning you might as well claim that every language supports OO, or that object-orientation is just a design pattern. Instead, we can call the package of (abstract type)+(operations that work on that data type) a "data structure" or ADT, and then reconsider objects to see what's left.
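To make that package concrete, here is a minimal sketch of the students' reusable binary tree written without defining any classes. It's in Python, where the underlying values are technically objects, but nothing below relies on methods or inheritance. The function names and the tuple representation are my own assumptions for illustration; the point is only that callers go through the provided operations and never look inside the tree.

```python
# A binary search tree as an abstract data type, with no classes defined.
# Representation (an assumption of this sketch): None is the empty tree,
# and a node is a 3-tuple (left, value, right). Callers are meant to use
# only the functions below, never the tuples directly.

def tree_empty():
    """Return an empty tree."""
    return None

def tree_insert(tree, key):
    """Return a new tree that also contains key; the input is not modified."""
    if tree is None:
        return (None, key, None)
    left, value, right = tree
    if key < value:
        return (tree_insert(left, key), value, right)
    if key > value:
        return (left, value, tree_insert(right, key))
    return tree  # key already present, nothing to add

def tree_member(tree, key):
    """Report whether key is stored in the tree."""
    while tree is not None:
        left, value, right = tree
        if key == value:
            return True
        tree = left if key < value else right
    return False

def tree_to_list(tree):
    """Return the stored keys in ascending order."""
    if tree is None:
        return []
    left, value, right = tree
    return tree_to_list(left) + [value] + tree_to_list(right)

# Usage: build a tree from a few keys, including a duplicate.
t = tree_empty()
for k in [5, 2, 8, 2]:
    t = tree_insert(t, k)
```

Anything that supports "<" can be stored, so the tree is as reusable as a class would be; what's missing, compared with objects, is any way for a caller to substitute their own variant of the tree.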

And here is where it gets slippery, so I won't volunteer much. One point of view is that OO is just more pleasant syntax for ADTs. Another point of view, from the type-oriented language-semanticist perspective, is that OO adds dynamic dispatch on the type of the "message receiver". In O'Caml (which is known for supporting OO -- that's why they added the "O", after all) most data structures, such as hash tables or sets, are implemented just as ADTs. Objects, by contrast, allow you to extend them through subclassing. (I can't, for example, create my own type of file object in standard C and have fread work for it. On the other hand, unix does exactly that internally in supporting sockets and files with the same interface. Here, they've used C to implement dynamic dispatch underneath, but the interface exposed to the C library remains a plain ol' abstract data type.)
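The parenthetical above can be sketched in a few lines of Python. This is a toy under my own assumptions, not how any real kernel is written: each handle carries a table of operations, and the one public read function dispatches through that table. All the names here (open_buffer, open_zeros, handle_read) are hypothetical.

```python
import io

# Private implementations, one per kind of handle.
def _buffer_read(state, n):
    # Read up to n bytes from an in-memory buffer.
    return state.read(n)

def _zeros_read(state, n):
    # An endless source of zero bytes; it needs no state.
    return b"\0" * n

# Per-kind operation tables, playing the role of tables of function pointers.
_BUFFER_OPS = {"read": _buffer_read}
_ZEROS_OPS = {"read": _zeros_read}

def open_buffer(data):
    """Return a handle that reads from the given bytes."""
    return {"ops": _BUFFER_OPS, "state": io.BytesIO(data)}

def open_zeros():
    """Return a handle that yields zero bytes forever."""
    return {"ops": _ZEROS_OPS, "state": None}

def handle_read(handle, n):
    """The only public read operation: dispatch through the handle's table."""
    return handle["ops"]["read"](handle["state"], n)
```

From the caller's side there is still just one handle_read function and an opaque handle, which is the plain ADT. The dispatch is real, but only the author of the library can add a new kind of handle.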


It turns out that trying to shoehorn everything into the OO style works really well in many cases (I still think Ruby is great) but not in others. OO especially starts getting shaky when you have multiple objects involved. To zip three arrays in Ruby, it's array1.zip(array2, array3), which feels quite strange. Really, zipping is an operation that works over multiple arrays, not a method on one of them. By contrast, the "sendfile" system call, as exposed to C, is quite sensible: it's a function that takes both a "from" file and a "to" file.
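For comparison, Python's built-in zip is a plain function of this kind: it takes any number of sequences and none of them is singled out as the receiver. The variable names below are just placeholders.

```python
# Three lists of equal length; the names are illustrative only.
names = ["a", "b", "c"]
counts = [1, 2, 3]
flags = [True, False, True]

# zip is called on all three at once rather than as a method of one of them.
rows = list(zip(names, counts, flags))
# rows == [("a", 1, True), ("b", 2, False), ("c", 3, True)]
```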

In C++, an overload of the "+" operator that takes your object on the right-hand side must be implemented as a function external to the class. This feels clunky, but it follows from the fact that our underlying understanding of plus is as an operator of two arguments, not a method of an object.

Another example: to join a list of words with a separator in Python, it's sep.join(list), while in Ruby it's list.join(sep). Fans of each will slam the other: the Python code reads terribly and is counterintuitive, while the Ruby code feels more natural but sticks a string-specific method on all arrays. Again, one solution is to realize that string-joining is an operation that involves both a string and an array, and make it external to both.
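Here is a rough sketch of what the external version could look like, written in Python. The name join_words and its argument order are my own invention, not part of either language's library; the function belongs to neither the string type nor the list type.

```python
def join_words(words, sep):
    """Join the strings in words, placing sep between neighbours."""
    result = ""
    for i, word in enumerate(words):
        if i > 0:
            result += sep  # separator goes before every word but the first
        result += word
    return result

# Usage: neither argument is the receiver of a method call.
sentence = join_words(["data", "types", "are", "not", "objects"], " ")
```

Written this way, the argument about which type should own the method simply doesn't come up.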

It turns out that once you're willing to accept operations as external to the data types they work with, you can do some pretty interesting things. I'll get to this later.

* On the other hand, I'm not sure if they really get the lesson out of it, so maybe it's not worth it.