evan_tech

02:47 pm, 1 Mar 08

hexpat microbenchmarking

There was this thread on Reddit about my release of those Expat bindings. They were, in fact, motivated by speed. I wrote a little microbenchmarking library and found that, for the particular file formats I care about, Expat is an order of magnitude faster than HaXml:
$ ./perf
input is 325709 bytes.
* HaXml: ...
3.06 per second.
* hexpat: .......
37.31 per second.

[Updated: using ByteStrings brought this up to around 42 on my machine...]

(These comparisons are, of course, not at all fair: HaXml provides a lot more functionality and a DOM-like API. But for my project I don't care about all that; I just want it to be fast.)

I spent quite a while trying to figure out how to get GHC to re-run a pure function (for benchmarking purposes) and eventually gave up. You can make it work sometimes, but the optimizer likes to say "oh, we already computed that" and reuse the earlier result. There ought to be some pragma for this, but I couldn't find one.
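For what it's worth, here is a minimal sketch of one workaround that tends to help, not a guaranteed fix. The idea is to pass the function and its argument separately into a loop that is marked NOINLINE, so GHC has no shared `f x` expression to hoist, and to turn off full laziness for the module as an extra hedge. The names `repeatEval`, `runsPerSecond`, and `sumSquares` are made up for illustration; this is not the code from my benchmarking library, and a sufficiently clever optimizer could still defeat it.

```haskell
{-# OPTIONS_GHC -fno-full-laziness #-}
-- Sketch only: repeatEval, runsPerSecond and sumSquares are hypothetical names.

import Control.Exception (evaluate)
import Data.List (foldl')
import System.CPUTime (getCPUTime)

-- Apply f to x n times, forcing each result to weak head normal form.
-- f and x arrive as separate arguments, and NOINLINE keeps GHC from
-- specialising the loop at the call site and sharing a single result.
repeatEval :: (a -> b) -> a -> Int -> IO ()
repeatEval f x n
  | n <= 0    = return ()
  | otherwise = evaluate (f x) >> repeatEval f x (n - 1)
{-# NOINLINE repeatEval #-}

-- Run f on x n times and report runs per second of CPU time.
-- getCPUTime returns picoseconds; if the clock is too coarse to see any
-- elapsed time, the division yields Infinity rather than crashing.
runsPerSecond :: Int -> (a -> b) -> a -> IO Double
runsPerSecond n f x = do
  start <- getCPUTime
  repeatEval f x n
  end <- getCPUTime
  let seconds = fromIntegral (end - start) / 1e12 :: Double
  return (fromIntegral n / seconds)

-- A stand-in pure workload: the sum of the squares 1..n.
sumSquares :: Int -> Int
sumSquares n = foldl' (+) 0 [i * i | i <- [1 .. n]]

main :: IO ()
main = do
  rate <- runsPerSecond 200 sumSquares 100000
  putStrLn (show rate ++ " per second.")
```

Note that `evaluate` only forces the outermost constructor. That is enough for an `Int`, but a lazy result such as a parsed tree would need to be forced more deeply for the timing to mean anything.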