Evan Martin (evan) wrote in evan_tech,

hexpat microbenchmarking

There was a thread on Reddit about my release of the Expat bindings. They were, in fact, motivated by speed. I wrote a little microbenchmarking library and found that, for the particular file formats I care about, Expat is an order of magnitude faster than HaXml:
$ ./perf
input is 325709 bytes.
* HaXml: ...
3.06 per second.
* hexpat: .......
37.31 per second.

[Updated: using ByteStrings brought this up to around 42 on my machine...]

(These comparisons are, of course, not at all fair; HaXml provides a lot more functionality and a DOM-like API. But for my project I don't care about all that. I just want it to be fast.)

I spent quite a while trying to figure out how to get GHC to re-run a pure function (for benchmarking purposes) and eventually gave up. You can make it work sometimes, but the optimizer likes to say "oh, we already computed that" and reuse the earlier result. There ought to be some pragma for this, but I couldn't figure it out.
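
For what it's worth, here is a minimal sketch of one approach that tends to work. It is not what my benchmarking library does. The idea is to pass the function and its argument separately, apply them inside the loop body, keep the loop from being inlined, and turn off full laziness for the module so GHC is less inclined to float the application out of the loop. The names repeatWhnf and runsPerSecond are made up for this example, and none of this is a guarantee: it depends on the GHC version and optimization flags.

```haskell
{-# OPTIONS_GHC -fno-full-laziness #-}
-- Full laziness is disabled so that GHC does not float the loop-invariant
-- application (f x) out of the loop and share a single result.
module Main (main) where

import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- | Evaluate (f x) to weak head normal form n times.
-- (Hypothetical helper name, used only in this sketch.)
repeatWhnf :: (a -> b) -> a -> Int -> IO ()
repeatWhnf f x = go
  where
    go n
      | n <= 0    = return ()
      | otherwise = evaluate (f x) >> go (n - 1)
-- NOINLINE keeps GHC from specializing the loop at a call site where it
-- could see that f and x are constants.
{-# NOINLINE repeatWhnf #-}

-- | Run (f x) n times and report runs per CPU-second.
-- (Also a hypothetical name; getCPUTime returns picoseconds.)
runsPerSecond :: (a -> b) -> a -> Int -> IO Double
runsPerSecond f x n = do
  start <- getCPUTime
  repeatWhnf f x n
  end <- getCPUTime
  let seconds = fromIntegral (end - start) / 1e12 :: Double
  return (fromIntegral n / seconds)

main :: IO ()
main = do
  -- Stand-in workload: the length of a long string.
  let input = replicate 100000 'x'
  rate <- runsPerSecond length input 100
  putStrLn (show rate ++ " per second.")
```

Note that evaluate only forces the result to weak head normal form. That is enough for an Int, but a function returning a lazy structure (like a parse tree) would need something that walks the whole result, or the benchmark measures almost nothing.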
Tags: haskell, project