evan_tech

Previous Entry Share Next Entry
09:01 am, 10 May 05

greasemonkey index

People (myself included) have requested a way for Greasemonkey to tell you whether it has scripts that apply to the page you're currently viewing.

One proposal was that it could send the URL you're at to a server that could look it up. That's consistent with the XmlHttpRequest-style of Greasemonkey but terribly broken:
  • for privacy reasons, you don't want to send every URL you visit to a central server;
  • a central server couldn't handle the load of a bunch of people hitting it with every page.

It seems the right solution is to let clients periodically download an index.

Greasemonkey scripts provide regular expressions that match the URL when the script could run. So how would you build an index that could quickly find which script out of many would match? The straightforward way is to stick every regex in an array and then iterate through it, but that's O(n) and n's increasing rapidly.

Random thought: I wonder if it'd be worth building a tree of the regexes. At the top, a huge OR of all script regexes, and then two branches, each containing half. That shrinks the O() but not to log(n) 'cause you don't know which branch to take at each level... but at least most of the processing is happening inside the regex engine and not in Javascript. Or maybe you could wrap each one in parenthesis in the huge OR and check which group number matched...