Evan Martin (evan) wrote in evan_tech,

researcher of the moment: rada mihalcea

I saw a great talk today by Rada Mihalcea on some of her research. The gist is interpreting text as a graph and applying PageRank*-like computations. Depending on what your graph represents, different applications include word sense disambiguation (nodes: possible senses (from wordnet) of words, edges: connections between senses), text summarization (nodes: sentences, edges: weighted by sentence similarity), and more.
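To make the summarization case concrete, here is a minimal sketch of the idea, not the method from the talk. Sentences are nodes, edges are weighted by similarity, and a PageRank-style iteration scores each sentence. The Jaccard word-overlap similarity, the 0.85 damping factor, the fixed iteration count, and the function names are all assumptions chosen for illustration.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Jaccard overlap between two word sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank-style iteration.

    Nodes are sentences; the edge weight between two sentences is their
    similarity. Returns one score per sentence, in input order.
    """
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]
    # Symmetric weight matrix with no self-loops.
    weights = [
        [0.0 if i == j else similarity(tokens[i], tokens[j]) for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to edge weight.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


if __name__ == "__main__":
    text = [
        "the cat sat on the mat",
        "the cat ate the fish",
        "the fish swam in the sea",
        "quantum chromodynamics is hard",
    ]
    for score, sentence in sorted(zip(rank_sentences(text), text), reverse=True):
        print(f"{score:.3f}  {sentence}")
```

A summary would then just be the top few sentences by score. Swapping in a different graph (say, word senses as nodes) changes the application but not the ranking loop.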

Her work outperforms the current best-performing algorithms, including existing supervised ones, even though her approach is unsupervised. Awesome.
[update 19-oct-05] A note for gawkers coming from the unofficial Google weblog: this last paragraph was merely restating her conclusions and I haven't vetted them myself, nor was I involved in the grant-giving process. After this post, I read a few of her papers and then mostly forgot about the subject.

Rada referenced an algorithm by Jon Kleinberg, one of the researchers I try to follow more closely. I ought to make a reading list for y'all; there are fewer than ten I've found who consistently produce interesting work.

Additionally, clevercs just updated a bunch of posts, and there's a bunch of promising-looking stuff going on in there. In particular they linked to Recovering Device Drivers, which just got Best Paper at OSDI '04. That work was done at the University of Washington! Hank Levy taught my operating systems class, and I think Michael Swift is the guy I had a decently long talk with at a poster session a year or so ago.
Regarding OSDI '04: The other Best Paper is on model checking, which I know little about but I think goes back to language theory. (And also, if you're looking to read any of these papers, check Scholar or mail me; I may be able to find a link for you.)

* (TM), heh. I didn't realize it until after I started here that Larry named the algorithm after himself. It was a nice coincidence that it applies to web pages, too.
