[tahoe-dev] [p2p-hackers] convergent encryption reconsidered -- salting and key-strengthening

Wed Apr 2 18:14:06 PDT 2008

Folks,

It seems to me there are two orthogonal (but complementary) attacks
which are being conflated on this thread.  It's worth asking which
architectures suffer from each of these attacks, and which do not,
while still using convergent-encryption-based storage:

The "user to storage-index correlation" attack:

A service node participates in a convergent-encryption storage system
and correlates client nodes with the storage indexes (SIs) they query.
This does not reveal the stored content, but it does uniquely identify
it.  (Service node means any node that receives queries for SIs,
regardless of how that node processes those queries.)
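
To make the correlation attack concrete, here is a minimal sketch of
what such a service node could record.  The class name, the client
identifiers and the 16-byte SIs are all hypothetical, chosen only for
illustration; this is not how any real Tahoe storage server is
written.

```python
from collections import defaultdict


class LoggingServiceNode:
    """Hypothetical service node that records which client asked for which SI."""

    def __init__(self):
        # Maps each storage index (bytes) to the set of client ids that queried it.
        self.clients_by_si = defaultdict(set)

    def handle_query(self, client_id: str, si: bytes) -> None:
        # A real node would also serve or forward the request;
        # only the logging side effect is modeled here.
        self.clients_by_si[si].add(client_id)

    def clients_for(self, si: bytes) -> set:
        # Return a copy so callers cannot mutate the log.
        return set(self.clients_by_si.get(si, set()))


node = LoggingServiceNode()
node.handle_query("client-A", b"\x01" * 16)
node.handle_query("client-B", b"\x01" * 16)
node.handle_query("client-A", b"\x02" * 16)

# Both clients asked for the same SI, so the node knows they share a file,
# even though it never sees the plaintext.
shared = node.clients_for(b"\x01" * 16)
```

Note that the node learns who is interested in a given SI without
learning what the SI refers to.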

The "content presence" attack:

The attacker guesses some content, derives the storage index, then
either queries the system or passively records queries for the given
index.  This answers the question "Has someone stored content with
this index (or at least queried for it)?"  An essential feature of
this attack is that it applies across all users' data at once: a
single guess is tested against everything stored on the grid.  I
believe this is key, because when people think "dictionary attack" it
may be easy to overlook how practical that makes it.
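
The steps of the content presence attack can be sketched as follows.
This is a toy model under loud assumptions: the key and SI derivations
below are plain SHA-256 with made-up prefixes, not Tahoe's actual
hashing scheme, and the "grid" is just a set of SIs standing in for
whatever the attacker can query or observe.

```python
import hashlib


def convergent_key(content: bytes) -> bytes:
    # Assumption: the encryption key depends only on the plaintext.
    return hashlib.sha256(b"key:" + content).digest()


def storage_index(content: bytes) -> bytes:
    # Assumption: the SI is derived from the key, truncated to 16 bytes.
    return hashlib.sha256(b"si:" + convergent_key(content)).digest()[:16]


def content_present(guess: bytes, observed_indexes: set) -> bool:
    # The attacker never needs the stored data, only the derived index.
    return storage_index(guess) in observed_indexes


# Toy grid: some unknown user has stored one file.
grid_indexes = {storage_index(b"secret memo v2")}

# The attacker tries a small dictionary of guessed contents.
guesses = [b"secret memo v1", b"secret memo v2", b"secret memo v3"]
hits = [g for g in guesses if content_present(g, grid_indexes)]
```

Here `hits` contains only the guess that matches a stored file; the
attacker learns that someone holds that exact content, but not who.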

As an example of how these attacks are disjoint: Using only a Tahoe
client, I can perform a content presence attack (call it the existence
attack for short) by trying to retrieve some
contents without first publishing them.  If my (modified) client
successfully retrieves the contents, I know only that someone else has
stored them, but not who.  I do not need to control a storage node to
execute this attack (in other words, my client never receives SI
queries).

Now, here's a thought experiment on feasibility:

Consider that users back up their entire hard drives, and that on
some users' file systems there exists a file containing only a set of
Windows domain credentials, and that the domain is given as an FQDN.

Successfully executing the existence attack against a guessed version
of that file confirms the guessed credentials, which gives you all the
info you need to compromise that account without the need for the
user-to-SI attack.

Furthermore, if we assume the file in question is common on Windows
machines, then by the nature of the existence attack, we can rapidly
collect creds for many different users of the storage grid, all
without running a storage node.

In essence this is a brute-force search for credentials, but the
target is all users of the grid, rather than a single domain at a time.
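
A rough sketch of that grid-wide sweep is below.  Everything specific
in it is an assumption made for illustration: the credentials file
layout, the example domains, users and passwords, and the simplified
SHA-256 storage index (which is not Tahoe's real derivation).

```python
import hashlib
from itertools import product


def storage_index(content: bytes) -> bytes:
    # Simplified stand-in: SI derived from a content-derived key.
    key = hashlib.sha256(b"key:" + content).digest()
    return hashlib.sha256(b"si:" + key).digest()[:16]


def credentials_file(domain: str, user: str, password: str) -> bytes:
    # Hypothetical fixed file layout; the attack relies on the layout being guessable.
    return f"domain={domain}\nuser={user}\npassword={password}\n".encode()


def sweep(grid_indexes: set, domains, users, passwords) -> list:
    # Every candidate is tested against the whole grid at once,
    # so one pass covers all users rather than one domain at a time.
    found = []
    for d, u, p in product(domains, users, passwords):
        if storage_index(credentials_file(d, u, p)) in grid_indexes:
            found.append((d, u, p))
    return found


# Toy grid holding backups from two unrelated users.
grid_indexes = {
    storage_index(credentials_file("corp.example.com", "alice", "letmein")),
    storage_index(credentials_file("lab.example.org", "bob", "hunter2")),
}

found = sweep(
    grid_indexes,
    domains=["corp.example.com", "lab.example.org"],
    users=["alice", "bob"],
    passwords=["hunter2", "letmein"],
)
```

In this toy run the sweep recovers both users' credentials from a
single pass over eight candidates, without the attacker ever running
a storage node.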

I'm inclined to be paranoid, so I see this thought experiment as both
feasible and serious.