Evan Martin (evan) wrote in evan_tech,

sorry, specification!

Tonight I was talking with a coworker about search quality, and I said something like: "There are so many cases where there's the right way to do something, and then there's the way that works around all the bugs, and they always have to do the latter." Underneath I guess there's a philosophical issue (blah blah blah Postel's law) but I think I prefer to look at it as being descriptive instead of prescriptive. (Which doesn't mean I necessarily like it.)

One of my (least) favorites is page encoding: since pretty much nobody can get their webserver headers right, and many people can't even get the encoding explicitly stated in the page itself right (see also: Windows-1252 in purportedly-ISO-8859-1 XML feeds), browsers basically just ignore all of those and guess what encoding the page is really in. And then, since people test their pages with their browsers, to get the best coverage Google has to match the browsers' detection behaviors. Sorry, HTTP and HTML encoding specifications!
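To make the "ignore the label and guess" idea concrete, here is a minimal Python sketch. It is only an illustration, not what any browser or Google actually does: the function name `decode_leniently` and the fallback order are my own assumptions. The one piece taken from the paragraph above is treating a declared ISO-8859-1 as Windows-1252.

```python
from typing import Optional, Tuple

# Labels that, in this sketch, are assumed to really mean Windows-1252.
LATIN1_LABELS = {"iso-8859-1", "iso8859-1", "latin-1", "latin1"}


def decode_leniently(raw: bytes, declared: Optional[str] = None) -> Tuple[str, str]:
    """Decode raw page bytes, distrusting the declared encoding.

    Returns (text, codec_used). The candidate order is an assumption
    made for illustration; real detectors are far more elaborate.
    """
    candidates = []
    if declared:
        name = declared.strip().lower()
        if name in LATIN1_LABELS:
            name = "cp1252"  # Python's codec name for Windows-1252
        candidates.append(name)
    candidates += ["utf-8", "cp1252"]

    for codec in candidates:
        try:
            return raw.decode(codec), codec
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown label: move on to the next candidate.
            continue

    # Latin-1 maps every byte to a character, so this cannot fail.
    return raw.decode("latin-1"), "latin-1"
```

For example, `decode_leniently(b"\x93hi\x94", "ISO-8859-1")` yields curly quotes around "hi" via `cp1252`, where a strict ISO-8859-1 decode would produce invisible control characters.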

Another one is displaying XHTML in search results: it doesn't do much good to show a search result with a link directly to an XHTML document if Internet Explorer is just gonna give you a download box. Sorry, XHTML!

So it's nice to see that Matt and them have discovered(?) the Google bug that's been cropping up in my work lately, too: briefly, it appears that when using a persistent connection to a single IP but varying the Host: header (like with a single server running several virtual hosts), some servers get confused and feed you the wrong pages. I dunno what the fix is, but I'd guess they have to make separate connections for different hosts on the same IP. Sorry, persistent connections!
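The workaround I'm guessing at could look roughly like the Python sketch below: cache persistent connections per (IP, port, Host) instead of per IP, so one socket never carries requests for two different virtual hosts. The class name `PerHostPool` and its `get` method are made up for illustration, and I have no idea whether the real fix looks anything like this.

```python
import http.client
from typing import Callable, Dict, Tuple

Key = Tuple[str, int, str]


class PerHostPool:
    """Hypothetical pool that never shares a connection across Host values."""

    def __init__(self, connect: Callable[[str, int], object] = http.client.HTTPConnection):
        # `connect` builds a connection object for (ip, port); it is
        # injectable so the pool can be exercised without a network.
        self._connect = connect
        self._conns: Dict[Key, object] = {}

    def get(self, ip: str, port: int, host: str):
        # Host names are case-insensitive, so normalize before keying.
        key = (ip, port, host.lower())
        if key not in self._conns:
            self._conns[key] = self._connect(ip, port)
        return self._conns[key]
```

With the default factory, a fetch would be something like `conn = pool.get(ip, 80, host)` followed by `conn.request("GET", "/", headers={"Host": host})`. The cost is more open sockets per server, which is exactly the saving persistent connections were supposed to provide.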
Tags: google, personal
