[tahoe-dev] filerock, SpiderOak, mega, "BitTorrent Share", Dropbox, dedupe

Zooko O'Whielacronx zookog at gmail.com
Mon Jan 28 17:43:12 UTC 2013


On Fri, Jan 25, 2013 at 12:51 AM, Uncle Zzzen <unclezzzen at gmail.com> wrote:
>
> I've been looking at https://www.filerock.com/ and although I have some reservations (server isn't open source, reasons to believe they collect statistics - e.g. web interface has google analytics, etc.) it's still interesting as something I could tell granny: "use this, it's pretty safe" (tried this with LAE and she's still recovering :) ), so any insight about them is welcome.

Wow, this raises a lot of responses in me:

1. Hey, cool, I haven't seen this company before. They advertise
client-side crypto and publish (even open source?) the client. That's
good competition for LeastAuthority.com!

2. Last I heard, the Tahoe-LAFS Software Foundation had google
analytics on https://tahoe-lafs.org. Was that taken down? If not, can
I see the resulting statistics?

3. I hope your granny recovers quickly. :-) I do *so* wish for a
Dropbox-like-GUI for Tahoe-LAFS! That's the only metaphor that really
works for people like your granny. Hm, I see that we don't even have a
trac ticket for it! Added to my TODO list to create a trac ticket. (Heh.)

What I mean by "a Dropbox-like UI" is that there is a folder on a
computer and its contents get magically synced (in both/all
directions) with certain folders on other computers that are
"magically linked" to this one. So the way you use that UI is just put
your file in that magic folder. We have the first half of that
functionality in the form of the "drop-upload" feature
(https://tahoe-lafs.org/trac/tahoe-lafs/browser/git/docs/frontends/drop-upload.rst?rev=496b65bf0231e1f7f4ff97a430642068bcf7a6f9
), but of course we need the full functionality, plus easy
installation+configuration, before lots of people can use it.

4. Hm, this "filerock" company was founded by a bunch of Italian
academics who had worked on Authenticated Data Structures. Cool. They
have a patent on some authentication something-mumble.

5. Does anyone else know more about filerock, such as pricing or
performance? Has anyone looked at the client source code?

6. Of course, just to be clear, even though I approve of the invention
of more open-sourced, client-side encryption, I would not advise you
to rely on filerock for something very valuable or dangerous, just
yet. It is *very* hard to get this stuff right, and most new tools
have critical bugs in them. I haven't looked at the filerock source
code. (And in fact, perhaps I should not do so, as it might arguably
enable them to make copyright claims against anything I write after
that.)


> Anyway - I was reading the slides about "dedupable crypto" zooko has mentioned (don't remember where, can't find url now, but here's what I think is the paper), and my main concern is an attacker's ability to prove I'm storing known plaintext (censored, copyrighted, etc.). The estimate of what you save from this is 50% (just charge the customers twice, case closed). What you risk may be jail or worse :(
>
> Now filerock has a very trivial approach: there's a folder called "encrypted" and the rest isn't (and can be easily deduped).
>
> At the moment - everything in Tahoe-LAFS is encrypted (ain't complainin'). In future Tahoe-LAFS releases I'd rather see a choice per file between "encrypted (default)" and "plaintext (cheaper)" than having to use "dedupable crypto", exposing myself to censorship/copyright/etc. attacks.

Wait! Hold on a minute. In Tahoe-LAFS, by default, you have maximal
confidentiality about the contents of your files. There is no
deduplication, by default, between separate gateways, so none of the
leakages of confidentiality that you mention here are a threat. You
*could* configure your gateway to share deduplication scope with other
gateways, for example the other gateways in a friendgrid, and then you
become vulnerable to the people who control those gateways (but not to
anyone else) being able to do those sorts of confidentiality attacks
on you. I'll update the FAQ
(https://tahoe-lafs.org/trac/tahoe-lafs/wiki/FAQ#Q15_same_file_same_cap
) to attempt to clarify this.
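
To make "deduplication scope" concrete, here is a minimal Python
sketch. It is not Tahoe-LAFS's actual key-derivation code: the names
convergent_key, storage_id and dedupes, the hash construction, and the
example secrets are all assumptions made up for illustration. The only
thing it tries to show is that two uploads of the same file collide on
the same identifier only when both uploaders use the same convergence
secret.

```python
import hashlib


def convergent_key(convergence_secret: bytes, plaintext: bytes) -> bytes:
    """Derive an encryption key from a secret plus the file contents.

    Illustrative construction only (length-prefixed secret, then the
    plaintext, hashed with SHA-256); not the real Tahoe-LAFS scheme.
    """
    h = hashlib.sha256()
    h.update(len(convergence_secret).to_bytes(8, "big"))
    h.update(convergence_secret)
    h.update(plaintext)
    return h.digest()


def storage_id(key: bytes) -> str:
    """Hypothetical identifier a storage server could see and compare."""
    return hashlib.sha256(b"storage-id:" + key).hexdigest()


def dedupes(secret_a: bytes, secret_b: bytes, plaintext: bytes) -> bool:
    """True if two uploaders of the same plaintext land on the same id."""
    id_a = storage_id(convergent_key(secret_a, plaintext))
    id_b = storage_id(convergent_key(secret_b, plaintext))
    return id_a == id_b


doc = b"contents of some well-known file"
alice_secret = b"alice-gateway-secret"      # made-up per-gateway secrets
bob_secret = b"bob-gateway-secret"
friendgrid_secret = b"shared-by-friendgrid"

# Default case: each gateway has its own secret, so nothing collides
# across gateways and nobody else can confirm what Alice stored.
separate_gateways_collide = dedupes(alice_secret, bob_secret, doc)

# Shared scope: everyone holding the shared secret can deduplicate with,
# and therefore confirm known plaintext against, everyone else in scope.
shared_scope_collides = dedupes(friendgrid_secret, friendgrid_secret, doc)
```

The trade-off is visible in the last two lines: whoever shares the
secret shares both the space savings and the ability to check whether
you are storing a file they already have.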

Your UI suggestion sounds like an interesting alternative. One major
issue with Tahoe-LAFS deduplication is that (a) few people understand
it, and (b) few people know how to control it. Maybe having two
folders, one for "files which I want to share deduplication with
everyone in the world" and the other for "private files" is the
solution.

Or maybe there's no point in separating the concept of "keep the file
contents private but deduplicate the files with everyone in the world"
from the concept "publish the readcap to everyone in the world"! In
that case, the two folders should be "Public" and "Private". "Public"
has a convergence secret of the empty string, and all caps dropped
into Public get posted to some great collective aggregation of
readcaps in the sky somewhere. "Private" has a convergence secret
scoped to the uploading gateway (or, better yet, in the future, to the
virtual, Dropbox-like "magic folder"), and of course the readcaps are
not posted anywhere when you drop something into Private.
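
As a rough illustration of that two-folder idea, the policy could be
written down as a small table. This is only a sketch under my own
assumptions: FolderPolicy, make_policies and the field names are
hypothetical, and nothing like this exists in Tahoe-LAFS today.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class FolderPolicy:
    convergence_secret: bytes   # scope within which uploads deduplicate
    publish_readcaps: bool      # whether caps get posted for everyone


def make_policies(gateway_secret: bytes) -> dict[str, FolderPolicy]:
    """Map each magic folder name to its (hypothetical) upload policy."""
    return {
        # Empty secret: deduplicates with everyone, and caps are shared.
        "Public": FolderPolicy(convergence_secret=b"", publish_readcaps=True),
        # Secret known only to this gateway: no cross-gateway dedupe.
        "Private": FolderPolicy(convergence_secret=gateway_secret,
                                publish_readcaps=False),
    }


policies = make_policies(secrets.token_bytes(32))
```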

Feedback on metaphors and use-cases is welcome! Everyone please post
about how you use file-sharing, sync, backup, etc.


Oh, here's another competitor that offers client-side encryption:
SpiderOak. Please #include a similar litany of praise/encouragement
and caveats about SpiderOak as I posted above about filerock. (Except
I'm less concerned about SpiderOak suing me for copyright violation
since -- heh heh -- they use some of my GPL'ed source code (zfec) in
their proprietary servers.)

Anyway, they wrote an interesting blog entry a couple of years ago
about deduplication:

https://spideroak.com/blog/20100827150530-why-spideroak-doesnt-de-duplicate-data-across-users-and-why-it-should-worry-you-if-we-did

The question I have after reading that is whether they are making the user
vulnerable to the SpiderOak server itself, or if they are protecting
the user from the SpiderOak server as well as from the other SpiderOak
users. Also, I wonder about the actual details about how their
protocol works (as mentioned above, such systems are often flawed, and
should be assumed to be vulnerable until shown otherwise), and I
wonder if their design or their policy has changed since they posted
that blog post.

Oh, and another similar thing that is a big deal on twitter right now
is the new service from Mega (formerly Megaupload). There are
quite a lot of issues about that, starting with the fact that the
entire client-side implementation is in Javascript that is served up
by the server on every page request and is thus trivially
back-doorable by the server (cf. the HushMail incident cited in
lafs.pdf). And I don't know anything about deduplication in Mega. But,
it is important because it is widely discussed right now, and it is
likely to end up with a large number of users since Javascript is so
easy to deploy, and since they have a slick UI, and they have a
marketing department with a budget, and so on. So, while there are
obviously many bad things about Mega, there are at least two good
things: 1. It shows that there are some customers out there who care
about client-side encryption (even if they might not actually be
getting what they pay for, in the current version of Mega), and 2. It
shows that there are some customers who are okay with read-cap-in-URL!
Apparently Mega generates a URL for each file, and possession of that
URL is sufficient to give access to that file! Maybe they learned it
from us. I hope so.
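
For anyone unfamiliar with the read-cap-in-URL idea, here is a minimal
Python sketch of what "possession of the URL is sufficient" means. It
is not Mega's URL format and not the Tahoe-LAFS cap format: the host
storage.example, the path layout, the choice to carry the key in the
URL fragment, and the names make_cap_url and parse_cap_url are all
assumptions for illustration.

```python
import base64
import secrets
from urllib.parse import urlsplit


def make_cap_url(file_id: str, key: bytes) -> str:
    """Pack a file locator and its decryption key into one URL."""
    key_text = base64.urlsafe_b64encode(key).decode("ascii")
    return f"https://storage.example/file/{file_id}#{key_text}"


def parse_cap_url(url: str) -> tuple[str, bytes]:
    """Recover the locator and key; no account or password involved."""
    parts = urlsplit(url)
    file_id = parts.path.rsplit("/", 1)[-1]
    key = base64.urlsafe_b64decode(parts.fragment)
    return file_id, key


key = secrets.token_bytes(16)          # random per-file key
url = make_cap_url("abc123", key)      # "abc123" is a made-up locator

# Anyone who is handed the URL can do this, which is the whole point
# (and the whole risk) of the design.
recovered_id, recovered_key = parse_cap_url(url)
```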

And, apparently there is a new sharing service from BitTorrent
(founded by my old friend and collaborator/competitor Bram Cohen),
designed to compete with Dropbox, and backed by S3. I haven't seen any
details about it yet to let me know if it offers confidentiality
features.

Regards,

Zooko

