[tahoe-dev] Surely M$ can patent this process?
zooko
zooko at zooko.com
Sun Jan 27 08:18:50 PST 2008
[adding Cc: p2p-hackers and cryptography mailing lists as explained
below; Please trim your follow-ups as appropriate.]
Dear Gary Sumner:
On Jan 26, 2008, at 9:44 PM, Gary Sumner wrote:
> I was researching on the weekend and came across Tahoe…very
> exciting and can’t wait to delve in and understand more in detail.
>
> I was reading over Plank’s work around erasure encoding and that
> led me to Tahoe. One thing that I was really looking for was to be
> able to encrypt the data before storing it and so was very excited
> when I read your architecture doc and it says “When a file is to be
> added to the grid, it is first encrypted using a key that is
> derived from the hash of the file itself.” This seems a perfectly
> logical and natural way to apply this technique. However,
> researching also led me to a patent M$ has been granted on this
> exact process:
>
> Encryption Systems and Methods for Identifying and Coalescing
> Identical Objects Encrypted with Different Keys -
> http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=6983365.PN.&OS=PN/6983365&RS=PN/6983365
>
I haven't read that patent, so I can't say whether it applies to what
allmydata.org Tahoe does or not. By default, for immutable files
(but not for mutable files or directories), Tahoe sets the encryption
key equal to the tagged hash of the file contents. (A tagged hash is
simply a hash of the data prefixed by a tag, which distinguishes it
from other uses of the hash function.) You don't have to use Tahoe
this way, however:
> The encryption before storing is critical for my application.
>
If, for any reason, you don't want to let your encryption key be
produced from the secure hash of the file contents, then Tahoe can
instead use a randomly-generated encryption key. The drawback of
doing it this way -- with a random encryption key -- is that you lose
the "deduplication" feature: two people who independently store the
same file contents will use twice as much space, instead of each of
them having a pointer to a single stored copy. The advantages of
doing it with a random encryption key are that you get a stronger
guarantee about the confidentiality of the contents of your files,
and it is faster as you don't need to process the whole file (in
order to generate the encryption key) before beginning to upload the
file.
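
To make the two choices concrete, here is a minimal sketch of the key
generation only (no actual encryption). It assumes SHA-256, a 16-byte
key, and a made-up tag string; the tag, hash function, and key length
that Tahoe really uses are not given in this message and may differ.
The function names are hypothetical too.

```python
import hashlib
import os

# Hypothetical tag, for illustration only (not Tahoe's real tag).
EXAMPLE_TAG = b"example-content-derived-key-v1:"
KEY_LENGTH = 16  # assumed key size in bytes


def tagged_hash(tag: bytes, data: bytes) -> bytes:
    """Hash the data prefixed by a tag, so the result is distinct
    from other uses of the same hash function on the same data."""
    h = hashlib.sha256()
    h.update(tag)
    h.update(data)
    return h.digest()


def content_derived_key(contents: bytes) -> bytes:
    """Key derived from the file contents: the same contents always
    give the same key, which is what makes deduplication possible.
    Note that the whole file must be read before the key is known."""
    return tagged_hash(EXAMPLE_TAG, contents)[:KEY_LENGTH]


def random_key() -> bytes:
    """Key that is independent of the contents: no deduplication, but
    nothing about the key can be learned from guessing the file."""
    return os.urandom(KEY_LENGTH)


if __name__ == "__main__":
    contents = b"the same file, stored by two different people"
    # Two independent uploaders arrive at the identical key...
    print(content_derived_key(contents) == content_derived_key(contents))
    # ...whereas two random keys will (almost certainly) differ.
    print(random_key() == random_key())
```

With the content-derived key, two people who hold the same file
compute the same key without talking to each other; with the random
key they cannot, so their uploads cannot be coalesced.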
> Surely there must be prior art on this technique to refute this
> patent?
>
That's an interesting question, and I'm carbon-copying the
p2p-hackers and cryptography mailing lists to ask if anyone knows. I
learned about this technique from Jim McCoy and Doug Barnes in their
design of Mojo Nation. I don't remember whether this technique was
mentioned in Jim McCoy's personal communication of Mojo Nation to me
in the summer of 1998, but it was definitely present in the design
when I started working for Jim and Doug on Mojo Nation in 1999, and
when Mojo Nation was first announced to the world at DefCon in July
2000 [1, 2]. I don't know if Jim came up with the idea ex nihilo or
was exposed to it in the swirling soup of ideas that we lived in at
the time: cypherpunks / Electric Communities (which had many ideas
gleaned from Xanadu) / Financial Cryptography / etc.
I remember reading about the newly announced Freenet project in 2000
and being surprised at how many similarities its design had to our
unannounced Mojo Nation project. The influential Freenet paper [3]
was published in July, 2000 -- one month too late to count as prior
art for that patent, which was filed May 2000. However, that paper
was based on Ian Clarke's master's thesis, which was published in
1999. Let's see... Ah, there it is: [4]. Hm, no, it does not seem to
contain the notion that the 2000 Freenet paper would popularize as
"Content Hash Keys".
I've also just now re-read The Eternity Service (Anderson, 1996) [5],
and it, like Clarke 1999, omits details of encryption.
It's an interesting puzzle of intellectual history. The idea
certainly seems to have been "in the air", as both Mojo Nation and
Freenet were working on it before the May 2000 patent submission by
Douceur et al., but Mojo Nation and Freenet each published the idea
shortly after May 2000. According to my limited understanding of
patent law, this means that they don't count as prior art on that
patent.
Regards,
Zooko
[1] http://www.mccullagh.org/image/950-12/jim-mccoy-mojonation.html
[2] http://web.archive.org/web/20001118214000/http://www.mojonation.net/docs/technical_overview.shtml
[3] http://citeseer.ist.psu.edu/420356.html
[4] http://citeseer.ist.psu.edu/380453.html
[5] http://citeseer.ist.psu.edu/anderson96eternity.html