10:57 am, 6 May 05
web accelerator
What do you all think of the web accelerator?
Looking around on the web I see some accusations of it being unglamorous, but as far as I can tell it's just the sort of thing Google's strong at: large distributed systems, web access, etc. And there is plenty of room for interesting algorithms.
I hadn't really thought hard about it until I learned we were making this thing, but there's a lot more to it than simple caching if you intend to do it well. For example, "differential compression of web pages": if you know the client already has an old version of a page, you can potentially send them just a diff from the old page to the new one.
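To make that concrete, here's a toy sketch in JavaScript. This is not Google's actual scheme (which isn't public); a real implementation would presumably use something like RFC 3229 delta encoding or vcdiff. This one just trims the common prefix and suffix and ships only the middle:

  // Toy delta encoder: find the longest common prefix and suffix shared by
  // the old (cached) page and the new page, and transmit only the middle.
  function makeDelta(oldText, newText) {
    const max = Math.min(oldText.length, newText.length);
    let prefix = 0;
    while (prefix < max && oldText[prefix] === newText[prefix]) prefix++;
    let suffix = 0;
    while (suffix < max - prefix &&
           oldText[oldText.length - 1 - suffix] === newText[newText.length - 1 - suffix]) suffix++;
    return { prefix, suffix, middle: newText.slice(prefix, newText.length - suffix) };
  }

  // Client side: splice the middle into the copy it already has cached.
  function applyDelta(oldText, delta) {
    return oldText.slice(0, delta.prefix) + delta.middle +
           oldText.slice(oldText.length - delta.suffix);
  }

  const oldPage = '<html><body>Visitors today: 41</body></html>';
  const newPage = '<html><body>Visitors today: 42</body></html>';
  const delta = makeDelta(oldPage, newPage);  // { prefix: 29, suffix: 14, middle: '2' }
  console.log(applyDelta(oldPage, delta) === newPage);  // true -- one byte sent, not the whole page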
I've already seen one bug report that's a little depressing: link prefetching may "click" on links like "delete this item". How? Because webmasters don't know the difference between GET and POST. Uninformed webmasters always get in the way of nice technology. :)
I noticed FogBugz has some links that perform actions (subscribing/unsubscribing from email alerts on bug changes...). I thought Joel was a smart guy.
Anyway, about the web accelerator: I guess I think that mostly it's just dull (caching proxy, whee), but the subpoena-fodder security implications are completely horrible. It's even more egregious than the fact that Google shares cookies between search and mail.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
I'm not saying your UI has to be a form button. You could make it a form image, a button styled as text, or text that does a JS submit (see the sketch after the list below), but it SHOULD make the browser do a POST, not a GET/HEAD.
Otherwise, aside from prefetching:
- search engines will crawl your gronk buttons
- malicious users can put <img src='buttonurl'> in their LJ posts
(The latter has been a problem with both LJ and Orkut.)
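Here's roughly what I mean, as a sketch. The URL and field name are made up, but the point is that the browser ends up issuing a POST, which prefetchers, crawlers, and <img> tags won't do:

  // Hypothetical "delete" action done as a JS-submitted POST. Prefetchers and
  // crawlers only follow GETs, and <img src=...> can only issue a GET, so none
  // of them can trigger this. The server should still reject non-POST requests.
  function postAction(url, params) {
    const form = document.createElement('form');
    form.method = 'POST';
    form.action = url;
    for (const [name, value] of Object.entries(params)) {
      const input = document.createElement('input');
      input.type = 'hidden';
      input.name = name;
      input.value = value;
      form.appendChild(input);
    }
    document.body.appendChild(form);  // the form must be in the document to submit
    form.submit();
  }

  // Styled as plain text in the page, but safe from GET-based "clicks":
  // <a href="#" onclick="postAction('/items/delete', {id: '42'}); return false;">delete this item</a>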
It's prefetching? I thought it was caching on a large scale, like AOL does.
I did some packet sniffing to disprove some of these claims, but a lot of the traffic is compressed, so it isn't easily discernible what Google is sending you.
A page explaining how it caches things and the rules it uses would be nice.
I get the odd feeling that's going to be considered "proprietary" information, and therefore won't even hit the "How to keep program and spec in sync?" stage of consideration =\
It doesn't have to be detailed -- just something along the lines of, "we respect Cache-Control headers."
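Something like this is all I mean -- a rough sketch of the rule, assuming it follows RFC 2616 (the header parsing here is simplified, and whether it treats itself as a shared or a private cache is exactly the open question):

  // Simplified cacheability check, per RFC 2616 section 14.9. A shared cache
  // (a network-wide proxy) must not store responses marked "private"; a
  // local, single-user cache may.
  function mayStore(headers, isSharedCache) {
    const cc = (headers['cache-control'] || '').toLowerCase();
    if (cc.includes('no-store')) return false;                 // never cacheable
    if (isSharedCache && cc.includes('private')) return false; // end-user caches only
    return true;
  }

  console.log(mayStore({ 'cache-control': 'private, max-age=600' }, true));   // false
  console.log(mayStore({ 'cache-control': 'private, max-age=600' }, false));  // true
  console.log(mayStore({ 'cache-control': 'no-store' }, false));              // false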
I've been using it for a few days. It doesn't really seem to have made much difference for me, but maybe that's because the stuff I visit regularly is essentially uncacheable, and I have a local caching proxy here anyway.
I think it would be quite neat to run it as a network-wide proxy so that everyone could share the benefit. I don't know whether the local cache respects Cache-Control: private or not, though.
In fact, I just checked its “Performance” page and it claims to have saved me zero seconds. I guess that means it's not actually managing to achieve anything at all for me.