evan_tech

Previous Entry Share Next Entry
I saw this pattern in some work code but I find it applying in a bunch of places. (I'm not much of a C++ wizard so maybe this is old news to you, but it impressed me.)

This comes up in situations where you have a function that wants to take a bunch of parameters of varying types: imagine printf, or something that wants to return an unknown number of parameters. Here's an example of the latter: you've just retrieved a row from a database, and you know that the first element is a string, the second an int, etc. The typical thing to do (which I do in LogJam, for example) is write it out:
// imagine "SELECT id, user FROM ..."
int id = sqlite3_column_int(stmt, 0);
const char *user = sqlite3_column_text(stmt, 1);

Tedious, and you have to update the numbers if you rearrange the select.

Here's the pattern. First you make a class that can be constructed with all the different types you'd like to accept:
class OutParam {
  // my kingdom for a discriminated union
  enum { UNUSED, INT, STRING /* etc */ } type_;
  union {
    int *int_out_;
    string *string_out_;
  };
  OutParam() : type_(UNUSED) {}
  OutParam(int *i) : type_(INT), int_out_(i) {}
  OutParam(string *s) : type_(STRING), string_out_(s) {}
  void get(const Row &row, int col) {
    switch (type_) {
    case UNUSED: break;
    case INT: *int_out_ = row.get_int(col); break;
    case STRING: row.get_string(col, string_out_);
    }
  }
};
static const OutParam UNUSED;
Then you can provide a function that accepts some of these objects and fills them out accordingly:
class Row {
  // get_int and friends wrapping the sqlite API
  void get(OutParam p0, OutParam p1=UNUSED,
           OutParam p1=UNUSED, OutParam p2=UNUSED) {
    p0.get(*this, 0);
    p1.get(*this, 1);
    p2.get(*this, 2);
    p3.get(*this, 3);
  }
};
This now allows the user to retrieve all the fields in one go while retaining static typing:
int id; string user;
row = /* something that runs the select, as above */;
row->get(&id, &user);
If you wanna see the full implementation, browse my repository.

(Note that this is more powerful than the tuple-returns you see in today's high-level languages, because it knows which fields are ints and which are strings.

I feel like I've written code like this a million times:
foo, bar, baz = ARGV[0].to_i, ARGV[1], ARGV[2]
where
foo, bar, baz = ARGV
would almost do except for the to_i.)

I suppose I really ought to read through the boost code before I get too self-congratulatory about this (I just found this in someone else's code, so my only real pride in it is taking the effort to type it out). And now that I think about it, monotone is written by C++ hax0rs and uses sqlite so they probably do something clever there, too...


I recently saw a similar pattern a vaguely related idea that's even fancier in Haskell. Haskell lets you overload functions by return type, which is even weirder with type inference because it means that the behavior of a function call can effectively depend on how you use the variables in the code afterwards. So their library let you write code like:
let (verbose, filename) = getOptions ('v', 'f')
when verbose $ putStrLn "Starting up..."
open filename    {- or whatever -}
Since verbose is used in a when test, the compiler knows it's a Bool, which means that that getOptions knows that the v flag takes no argument. Similar reasoning lets it conclude that the f flag requires a string argument, and so getOptions can nearly magically know which of the flags you give it need which sort of arguments and do the appropriate error-checking and reporting.

It's a pretty sweet and sick hack. (They do some similarly sick stuff for flag documentation.) Unfortunately the implementation is a bit on the sick side too...