09:17 pm, 16 Nov 07
content addressable
I recently read the Venti paper, which made me a little sad, as all Plan 9 stuff does, because there were so many interesting ideas there that have all been forgotten.
Here's a thought: if links on the web were to SHA-1s instead of hostnames and paths, you could (a) be assured that the content on the other side of the link was always exactly what you linked to, and (b) reliably handle mirroring for free (anyone could replicate the data of a SHA-1 and you could verify it was a correct mirror). The only sticky part is resolving a SHA-1 to an IP that would return the data you requested, but there's a lot of research (DHTs, even DNS) in that space. The major weirdness is that it'd be impossible to "update" a page without everyone adjusting their links.
You could imagine creating links of the form
sha1://[hex string], confident that in the future someone would come up with a way of resolving it. On the other hand, you can also be confident that SHA-1 collisions will be found, so maybe it's not so useful for archival at the web scale.
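To make the mirroring point concrete, here is a minimal sketch of what a client could do with such a link, assuming the bytes have already been fetched from some untrusted mirror. The `sha1://` form is the one proposed above; `make_link` and `verify` are hypothetical helper names, and no real resolver for this scheme is assumed to exist.

```python
import hashlib

# Link form proposed in the post; purely illustrative, nothing resolves it today.
SCHEME = "sha1://"


def make_link(content: bytes) -> str:
    """Build a sha1:// link that names exactly these bytes."""
    return SCHEME + hashlib.sha1(content).hexdigest()


def verify(link: str, content: bytes) -> bool:
    """Return True if `content` is exactly what `link` names."""
    if not link.startswith(SCHEME):
        raise ValueError("not a sha1:// link")
    expected = link[len(SCHEME):].lower()
    return hashlib.sha1(content).hexdigest() == expected


page = b"<p>hello</p>\n"
link = make_link(page)

# Any mirror can serve the bytes; the client only has to re-hash them.
mirror_copy = bytes(page)
tampered_copy = page + b"<!-- edited -->"
print(verify(link, mirror_copy))
print(verify(link, tampered_copy))
```

The same check is also why "updating" a page is awkward: any edit changes the hash, so the old link keeps naming the old bytes.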
You could simply start writing <a href="..." sha1="f1d2d2f924e986ac86fdf7b36c94bcdf32
the usual solution for mutability
Usually in systems that support naming by SHA-256 or whatever, there's support for mutable documents that are named by the SHA-256 of a public key that authenticates signatures on updates to them. This makes the system somewhat more dependent on the ability of new updates to propagate and supersede old versions, and opens the possibility of "replay attacks" where someone maliciously feeds you an old copy of some resource. See "Wax and the Eternal Resource Locator", or whatever that paper is. Also Zooko's Triangle.
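A rough sketch of that scheme, under stated assumptions: the document name is the SHA-256 of the public key, each update carries a sequence number, and the client rejects anything not newer than what it has already seen. The `Update` layout, the sequence-number convention, and the `verify_signature` callable are all hypothetical; the signature check is left pluggable because Python's standard library has no public-key signatures.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass
class Update:
    public_key: bytes   # key whose hash is the document's name
    sequence: int       # assumed monotonically increasing version counter
    content: bytes
    signature: bytes    # signature over sequence + content


# (public_key, message, signature) -> bool; supplied by the caller (assumption).
Verifier = Callable[[bytes, bytes, bytes], bool]


def document_name(public_key: bytes) -> str:
    """Mutable documents are named by the hash of their public key."""
    return hashlib.sha256(public_key).hexdigest()


def signed_message(update: Update) -> bytes:
    """Bytes covered by the signature: the version number plus the content."""
    return update.sequence.to_bytes(8, "big") + update.content


def accept(name: str, update: Update, last_seen: int,
           verify_signature: Verifier) -> bool:
    """Accept an update only if it is for this name, signed, and newer."""
    if document_name(update.public_key) != name:
        return False  # key does not match the name we asked for
    if not verify_signature(update.public_key, signed_message(update),
                            update.signature):
        return False  # not signed by the key that owns the name
    return update.sequence > last_seen  # old copies are treated as replays
```

Note that the replay check only helps a client that has already seen a newer version; a fresh client has no way to tell that a validly signed old copy is stale, which is the propagation dependence described above.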
forgotten?
Erm, I hardly think these ideas are forgotten. What do you think all those magnet URIs or CHF-based revision IDs are?
It's just not a great idea for most web pages, as they are constructed on the fly out of bamboo and snot.
(Though this has not stopped Gerv from proposing it)
Reading the linked paper, it looks as if they come up with a function that can manufacture collisions for inputs based on their particular magic format. This doesn't mean it's possible to manufacture collisions for arbitrary data, though, unless your original data matches their magic format. If you could figure out how to generalize what they were doing to a range of different constants, then you'd be on to something!
Related work: FreeNet's addressing schemes.
You may be thinking of Coraid, who make Ethernet based storage boxes and use Plan 9 as the internal operating system on them.