<span style="font-family: Arial, Helvetica, sans-serif; font-size: 10pt">Regarding encrypted file stores and deduplication, SpiderOak published a good article on file-level deduplication of encrypted files:<br />
<br />
https://spideroak.com/blog/20100827150530-why-spideroak-doesnt-de-duplicate-data-across-users-and-why-it-should-worry-you-if-we-did<br />
<br />
Wuala seems to use the method SpiderOak cautions against. When a user tries to upload a file, the client app encrypts it, hashes it, and asks the network if an encrypted file already exists with the same hash. If so, the existing file is linked into the user's account (no upload needed!). It's a neat concept, but it has one big disadvantage: the network can see each user who is sharing a file with a given hash.<br />
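
To make that privacy leak concrete, here's a tiny, self-contained Python sketch of that kind of scheme. It is purely illustrative -- the in-memory "network" (STORED_BLOBS / ACCOUNT_LINKS), the hash-based keystream cipher, and all the names are my own assumptions, not Wuala's actual client or protocol. The point is just that both the dedup check and the account link key off the ciphertext hash, which is identical for every user holding the same file:

import hashlib

# Toy in-memory "network": both dicts, and everything else here, are
# hypothetical -- this only shows the shape of the scheme.
STORED_BLOBS = {}    # ciphertext hash -> ciphertext
ACCOUNT_LINKS = {}   # (user, filename) -> ciphertext hash


def convergent_encrypt(plaintext: bytes) -> bytes:
    # Stand-in for convergent encryption: the key is derived from the
    # plaintext itself, so identical files always produce identical
    # ciphertexts.  A real client would use something like AES with
    # key = H(plaintext); a hash-based keystream keeps the sketch
    # dependency-free.
    key = hashlib.sha256(plaintext).digest()
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


def upload(user: str, filename: str, plaintext: bytes) -> bool:
    # Returns True if the ciphertext was already on the network (dedup hit),
    # False if it actually had to be uploaded.
    ciphertext = convergent_encrypt(plaintext)
    digest = hashlib.sha256(ciphertext).hexdigest()
    dedup_hit = digest in STORED_BLOBS        # "does anyone already have this?"
    if not dedup_hit:
        STORED_BLOBS[digest] = ciphertext     # the actual upload
    ACCOUNT_LINKS[(user, filename)] = digest  # link it into the account
    return dedup_hit


if __name__ == "__main__":
    report = b"contents of some widely shared document"
    print(upload("alice", "report.pdf", report))        # False: first copy
    print(upload("bob", "copy-of-report.pdf", report))  # True: dedup hit --
    # and the operator can now see that alice and bob hold the same file.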

So global file-level deduplication = bad for privacy. That isn't necessarily true for block-level dedup. Let's say we break a file into 8 kB chunks, encrypt each chunk with the user's own key, and push those chunks to the network. The same file uploaded by different users would then produce completely different block sets, so nothing leaks across accounts. Each storage node could maintain a hash table of the blocks it's storing; when the client node pushes out a block, it first queries the known storage nodes to see if one of them is already holding a block with that hash, and skips the transfer if so. The block size might need to be <= 4 kB for that to be effective.
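
Here's an equally rough sketch of that block-level idea, under the same caveats: the StorageNode class, the keystream_encrypt() stand-in for real symmetric encryption, and the node-selection rule are all hypothetical, and this is nothing like how Tahoe actually places shares. It just shows that per-user encryption kills cross-user block matches while a per-node hash table still lets a client skip pushing blocks somebody already holds:

import hashlib
import os

CHUNK_SIZE = 8 * 1024  # 8 kB as above; maybe <= 4 kB would work better


def keystream_encrypt(key: bytes, chunk: bytes) -> bytes:
    # Toy stand-in for real symmetric encryption: a keystream derived from
    # the user's key.  Different keys -> different ciphertext for the same
    # plaintext chunk, which is why cross-user dedup can't happen here.
    stream = b""
    counter = 0
    while len(stream) < len(chunk):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(c ^ s for c, s in zip(chunk, stream))


class StorageNode:
    # Hypothetical storage node keeping a hash table of the blocks it holds.
    def __init__(self):
        self.blocks = {}  # block hash -> encrypted block

    def has_block(self, digest):
        return digest in self.blocks

    def put_block(self, digest, block):
        self.blocks[digest] = block


def push_file(user_key, data, nodes):
    # Chunk, encrypt per-user, and only transfer blocks that no known node
    # is already holding.  Returns the number of blocks actually pushed.
    uploaded = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        block = keystream_encrypt(user_key, data[offset:offset + CHUNK_SIZE])
        digest = hashlib.sha256(block).hexdigest()
        if any(node.has_block(digest) for node in nodes):
            continue  # someone already has it; just record a reference
        nodes[int(digest[:8], 16) % len(nodes)].put_block(digest, block)
        uploaded += 1
    return uploaded


if __name__ == "__main__":
    nodes = [StorageNode() for _ in range(3)]
    data = os.urandom(32 * 1024)                  # four 8 kB chunks
    print(push_file(b"alice-key", data, nodes))   # 4: everything is new
    print(push_file(b"alice-key", data, nodes))   # 0: her own blocks dedup
    print(push_file(b"bob-key", data, nodes))     # 4: no overlap with alice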

I realize that's a big departure from the existing Tahoe architecture. Food for thought :)
<div id="divSignature"></div></span>