Evan Martin (evan) wrote in evan_tech,

researcher of the moment: rada mihalcea

I saw a great talk today by Rada Mihalcea on some of her research. The gist is interpreting text as a graph and applying PageRank*-like computations. Depending on what your graph represents, different applications include word sense disambiguation (nodes: possible senses (from wordnet) of words, edges: connections between senses), text summarization (nodes: sentences, edges: weighted by sentence similarity), and more.
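To make the summarization case concrete, here's a minimal sketch of the idea as I understood it: sentences are nodes, edges are weighted by similarity, and a PageRank-like iteration scores each node. This is not her actual implementation. The Jaccard word-overlap similarity, the damping value of 0.85, and the function names (`tokenize`, `similarity`, `rank_sentences`) are all my own assumptions for illustration.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercase the sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Jaccard overlap between two token sets (an assumed stand-in measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-like iteration over a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [tokenize(s) for s in sentences]

    # Undirected weighted graph: weights[i][j] is the similarity of i and j.
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w = similarity(tokens[i], tokens[j])
        weights[i][j] = weights[j][i] = w

    # Total edge weight leaving each node, used to normalize its votes.
    out_sum = [sum(row) for row in weights]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on a share of its score,
            # proportional to the weight of the edge between j and i.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


if __name__ == "__main__":
    example = [
        "the cat sat on the mat",
        "the cat ate the fish",
        "the fish swam in the sea",
        "quantum chromodynamics explains gluons",
    ]
    ranked = sorted(zip(rank_sentences(example), example), reverse=True)
    for score, sentence in ranked:
        print(f"{score:.3f}  {sentence}")
```

In this toy input the second sentence shares words with both of its neighbors, so it ends up ranked highest, while the unrelated last sentence has no edges and keeps only the baseline score. A summarizer along these lines would keep the top few sentences.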

Her work outperforms the current best-performing algorithms, including existing supervised ones, even though her approach is unsupervised. Awesome.
[update 19-oct-05] A note for gawkers coming from the unofficial Google weblog: this last paragraph was merely restating her conclusions and I haven't vetted them myself, nor was I involved in the grant-giving process. After this post, I read a few of her papers and then mostly forgot about the subject.

Rada referenced an algorithm by Jon Kleinberg, one of the researchers I try to follow more closely. I ought to make a reading list for y'all; there are fewer than ten I've found who consistently produce interesting work.

Additionally, clevercs just updated a bunch of posts, and there's a bunch of promising-looking stuff going on in there. In particular they linked to Recovering Device Drivers, which just got Best Paper at OSDI '04. That work was done at the University of Washington! Hank Levy taught my operating systems class, and I think Michael Swift is the guy I had a decently long talk with at a poster session a year or so ago.
Regarding OSDI '04: The other Best Paper is on model checking, which I know little about but I think goes back to language theory. (And also, if you're looking to read any of these papers, check Scholar or mail me; I may be able to find a link for you.)

* (TM), heh. I didn't realize until after I started here that Larry named the algorithm after himself. It was a nice coincidence that it applies to web pages, too.
