Evan Martin (evan) wrote in evan_tech,

NIST results

NIST 2005 Machine Translation Evaluation Official Results. As I mentioned before, Franz & co. (congrats again, hawk!) totally rocked it.

But it's worth noting that (at least according to Franz's papers from before Google; I don't know much about what they're actually doing here) part of his approach is to use the BLEU score as the objective function in their training. This does make sense: the BLEU score was designed to correlate with human judgments of translation quality, so it's a reasonable function to optimize. And the sentences they were given to translate must have been entirely separate from all available training data. But still, it feels a little weird to me that you'd optimize on the very metric used to judge; it means you can make "simple" translation mistakes (at least, simple to a human observer) but still get a good score, as long as the scoring function doesn't account for that sort of mistake.
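To make the blind spot concrete: BLEU is just n-gram overlap with a length penalty, so any error that preserves local n-grams goes unpunished. A minimal single-reference sketch (the real metric uses multiple references and corpus-level counts, so treat this as illustrative only):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with uniform weights: geometric mean of
    modified n-gram precisions, times a brevity penalty that punishes
    candidates shorter than the reference."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # "Modified" precision: each candidate n-gram is clipped to the
        # number of times it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # geometric mean collapses if any precision is zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * math.exp(log_mean)

ref = "the cat sat on the mat".split()
print(bleu(ref, ref))                              # identical sentence: 1.0
print(bleu("a dog stood by a door".split(), ref))  # no overlap: 0.0
```

An optimizer chasing this number is rewarded for matching surface n-grams, not for avoiding the kinds of mistakes a human reader would flag immediately.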
Tags: google, linguistics, papers

