Evan Martin (evan) wrote in evan_tech,

ocaml zen

I have achieved a state of OCaml zen.

I wanted to write something that would parse LJ's latest-rss feed, my favorite source of weird data. I wanted something fast, so I figured I could use OCaml. Googling for "ocaml fast xml" unfortunately finds one of my own evan_tech posts, but I hear expat is good, so I hooked everything together.

Everything that caught me before fell before my flying fingers: the confusing error messages, the way "if" statements need a "begin..end" around the body if you have more than one statement within them, the standard libraries, the way the semicolon goes between statements and not at the end of every one, benchmarking multiple sets of equivalent functions by having interchangeable wrappers return them, and even the Makefile hackery to generate bytecode or x86 code.
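For anyone who hasn't hit the "if" and semicolon thing yet, here's a minimal sketch of it. The function names (report, report_buggy) are made up for illustration; only Buffer from the standard library is used.

```ocaml
(* With begin..end, both statements belong to the "if". Note the
   semicolons sit between statements, not after the last one. *)
let report buf verbose =
  if verbose then begin
    Buffer.add_string buf "starting;";
    Buffer.add_string buf "verbose;"
  end;
  Buffer.add_string buf "done"

(* Without begin..end, the "if" ends at the first semicolon, so the
   second add_string runs unconditionally despite the indentation. *)
let report_buggy buf verbose =
  if verbose then
    Buffer.add_string buf "starting;";
  Buffer.add_string buf "verbose;";
  Buffer.add_string buf "done"
```

The indentation in report_buggy is deliberately misleading: the compiler reads it as "(if verbose then ...); ...; ...", which is exactly the trap.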

And Christ, expat is way faster than I expected. This program retrieves and prints the bodies of all the <description> tags (stringing together multiple chunks of character data in an automatically growing buffer) in a faked 2.6 MB XML document of around 1900 entries (I repeated the content a few times within it):
% time ./ex > /dev/null
./ex > /dev/null 0.13s user 0.01s system 76% cpu 0.182 total

and oh wait, that's the bytecode version. Native code:
% time ./ex > /dev/null
./ex > /dev/null 0.07s user 0.00s system 82% cpu 0.085 total

That number is small enough to be processor noise. I was planning to run this on danga, because that'll make the bandwidth free, and I was worried about wasting processor, but I think something that takes 0.085 seconds and runs every 10 or so seconds is acceptable. (Of course, this will be a lot more complicated when I'm done.) I'm way too accustomed to using DOM parsers.
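Roughly, the event-driven shape of the thing looks like the sketch below. This is not the actual program: the event type is a hypothetical stand-in for the start-element, end-element and character-data callbacks that a parser like expat fires, so it runs with just the standard library. The point is the Buffer that glues together the several character-data chunks a single <description> body can arrive in.

```ocaml
(* Hypothetical events standing in for the parser's callbacks. *)
type event =
  | Start of string   (* opening tag name *)
  | End of string     (* closing tag name *)
  | Chars of string   (* one chunk of character data *)

(* Collect the full text of every <description> element, in order. *)
let descriptions events =
  let buf = Buffer.create 256 in   (* grows automatically as needed *)
  let inside = ref false in
  let found = ref [] in
  let handle = function
    | Start "description" ->
        inside := true;
        Buffer.clear buf
    | End "description" ->
        inside := false;
        found := Buffer.contents buf :: !found
    | Chars s -> if !inside then Buffer.add_string buf s
    | Start _ | End _ -> ()
  in
  List.iter handle events;
  List.rev !found
```

Unlike a DOM parser, nothing here keeps the document tree around: the only state is one buffer and one flag, which is probably a big part of why this style is so cheap.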