[tahoe-dev] a crypto puzzle about digital signatures and future compatibility
Zooko Wilcox-O'Hearn
zooko at zooko.com
Thu Aug 27 09:02:42 PDT 2009
On Wednesday, 2009-08-26, at 19:49, Brian Warner wrote:
> Attack B is where Alice uploads a file, Bob gets the filecap and
> downloads it, Carol gets the same filecap and downloads it, and
> Carol desires to see the same file that Bob saw. ... The attackers
> (who may be Alice and/or other parties) get to craft the filecap
> and the shares however they like. The attackers win if Bob and
> Carol accept different documents.
Right, and if we add algorithm agility then this attack is possible
even if both SHA-2 and SHA-3 are perfectly secure!
Consider this variation of the scenario: Alice generates a filecap
and gives it to Bob. Bob uses it to fetch a file, reads the file and
sends the filecap to Carol along with a note saying that he approves
this file. Carol uses the filecap to fetch the file. The Bob-and-
Carol team loses if she gets a different file than the one he got.
Now suppose Alice is malicious and knows how to produce output which
appears to come from Tahoe-SHA2-SHA3, suppose Bob uses Tahoe-SHA3,
and suppose Carol uses Tahoe-SHA2. Then Alice can generate two
files, one which shows Bob what Alice wants him to see and the other
which shows Carol what Alice wants her to see.
So by adding an optional new hash algorithm intended to strengthen
Tahoe-LAFS against the possibility that someone can break SHA2, we
might (if we're not careful) open up a hole that can be exploited
even by someone who can't break SHA2.
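Here is a minimal sketch of that scenario. It assumes a hypothetical
filecap that carries one hash field per algorithm; this is not the real
Tahoe-LAFS cap format, and hashlib's sha256 and sha3_256 merely stand in
for "SHA2" and "SHA3". Each single-algorithm client checks only the
field it understands, so Alice needs no collision in either function:

```python
import hashlib

def make_cap(sha2_target: bytes, sha3_target: bytes) -> dict:
    # Hypothetical two-hash cap. Nothing forces both fields to
    # describe the same file, so Alice binds each to a different one.
    return {
        "sha2": hashlib.sha256(sha2_target).hexdigest(),
        "sha3": hashlib.sha3_256(sha3_target).hexdigest(),
    }

def accepts_sha2_only(cap: dict, data: bytes) -> bool:
    # A Tahoe-SHA2 client ignores the field it cannot check.
    return hashlib.sha256(data).hexdigest() == cap["sha2"]

def accepts_sha3_only(cap: dict, data: bytes) -> bool:
    # A Tahoe-SHA3 client likewise checks only its own field.
    return hashlib.sha3_256(data).hexdigest() == cap["sha3"]

file_for_bob = b"what Alice wants Bob to see"
file_for_carol = b"what Alice wants Carol to see"
cap = make_cap(sha2_target=file_for_carol, sha3_target=file_for_bob)

bob_ok = accepts_sha3_only(cap, file_for_bob)      # Bob runs Tahoe-SHA3
carol_ok = accepts_sha2_only(cap, file_for_carol)  # Carol runs Tahoe-SHA2
print(bob_ok, carol_ok)
```

Both checks pass against the same cap even though the two files differ.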
One defense against the attack above would be to make sure that, as
long as you might want to share files with someone who might still
use Tahoe-SHA2, you don't upgrade to Tahoe-SHA3 -- instead you
have to stick with the intermediate bi-lingual version, Tahoe-SHA2-
SHA3, which produces both kinds of hashes and checks both kinds of
hashes. But how can you tell whether there are still some Tahoe-SHA2
users out there somewhere that you might eventually want to share a
file with? Also, might this approach somehow accidentally prolong
Tahoe-LAFS's vulnerability to a flaw in SHA2?
So to use this defense, Bob would use Tahoe-SHA2-SHA3, and he would
always verify both hashes before approving the file. If one hash
matched but the other didn't, his Tahoe-LAFS software would warn him
that something is very wrong with Alice or her Tahoe-LAFS software.
(This means that we have to spend the CPU cycles verifying old-
fashioned hashes, and, worse, that we have to make file capabilities
twice as big in order to hold both hashes, which could negatively
impact the user experience.)
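A sketch of what such a bi-lingual check might look like, again
assuming a hypothetical cap with "sha2" and "sha3" fields (the names
verify_both and HashMismatch are made up for illustration):

```python
import hashlib

class HashMismatch(Exception):
    """Exactly one of the two hashes matched: warn the user."""

def verify_both(cap: dict, data: bytes) -> bool:
    sha2_ok = hashlib.sha256(data).hexdigest() == cap["sha2"]
    sha3_ok = hashlib.sha3_256(data).hexdigest() == cap["sha3"]
    if sha2_ok != sha3_ok:
        # One matched and the other did not: something is very wrong
        # with the uploader or her software.
        raise HashMismatch("one hash matched but the other did not")
    return sha2_ok  # both matched, or both failed

honest = b"one file"
honest_cap = {
    "sha2": hashlib.sha256(honest).hexdigest(),
    "sha3": hashlib.sha3_256(honest).hexdigest(),
}
print(verify_both(honest_cap, honest))
```

A cap whose two fields describe different files raises HashMismatch no
matter which of the two files is presented.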
Another, complementary, defense against this sort of attack would be
that if you receive a filecap which has a hash in it that you don't
know how to check, then you should *erase* that hash from the filecap
before you pass that filecap on to your friend. Then if Alice has a
malicious Tahoe-SHA2-SHA3, Bob has Tahoe-SHA2, and Carol has Tahoe-
SHA3, Bob will give Carol a filecap with only a SHA2 hash in it,
which Carol will not know how to check, thus defeating Alice's evil
scheme.
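The erase-before-forwarding rule could be sketched like this, assuming
the same hypothetical dict-shaped cap (cap_to_forward and KNOWN are
illustrative names, not Tahoe-LAFS APIs):

```python
KNOWN = {"sha2"}  # assumption: Bob runs Tahoe-SHA2 and can check only SHA2

def cap_to_forward(cap: dict, known=KNOWN) -> dict:
    # Keep only the hashes this client was able to verify itself;
    # anything it could not check is erased before passing the cap on.
    return {alg: digest for alg, digest in cap.items() if alg in known}

received = {"sha2": "digest-bob-verified", "sha3": "digest-bob-cannot-check"}
forwarded = cap_to_forward(received)
print(forwarded)
```

Carol, running Tahoe-SHA3, then finds no hash she can check in the
forwarded cap and refuses it rather than accepting a file Bob never saw.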
The bottom line is that the whole idea of adding algorithm agility
and an optional hash algorithm seems to entail complication and
danger, and Tahoe-LAFS is very likely going to take the alternate
route: a future version of Tahoe-LAFS will probably define a
completely different type of immutable file capability which is
syntactically distinct from the current type (i.e. it starts with a
distinct leading character or it is a different length so that it
cannot be confused with the old kind by a program and hopefully not
by a human either), and which uses only SHA3. Then you will not be
able to produce a single filecap which can be verified with both SHA2
and SHA3. You can, of course, produce two different filecaps, one in
the old format and one in the new format. This sounds good to me
because if Alice sends a pair of filecaps to Bob then it will be
obvious to Bob that the two could point to different files, at
Alice's discretion.
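As a rough illustration of "syntactically distinct" cap types, assume
two made-up prefixes, "OLD" and "NEW" (these are not real Tahoe-LAFS
cap prefixes). Each cap string then names exactly one algorithm and
carries exactly one hash:

```python
import hashlib

# Hypothetical prefixes; one cap type per hash algorithm.
ALGORITHMS = {"OLD": hashlib.sha256, "NEW": hashlib.sha3_256}

def make_cap(kind: str, data: bytes) -> str:
    return f"{kind}:{ALGORITHMS[kind](data).hexdigest()}"

def verify(cap: str, data: bytes) -> bool:
    kind, _, digest = cap.partition(":")
    if kind not in ALGORITHMS:
        raise ValueError("unknown cap type")
    # The prefix alone decides which algorithm is used; there is no
    # second hash field for an attacker to bind to another file.
    return ALGORITHMS[kind](data).hexdigest() == digest

old_cap = make_cap("OLD", b"some file")
new_cap = make_cap("NEW", b"some file")
print(old_cap != new_cap, verify(old_cap, b"some file"), verify(new_cap, b"some file"))
```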
> I always get confused about the difference between first-preimage
> and second-preimage, but I think there's a correspondence here. In
> Attack A, the attacker doesn't get to choose the filecap (i.e. the
> hash of the message): they've got to create shares to match a
> specific pre-determined cap. In Attack B, Alice can craft an
> arbitrarily complex message, taking advantage of a known collision
> or whatever.
Pre-image is when someone gives you only y and you find an input x
such that H(x) = y. Second-pre-image is when someone else chooses an
x and tells you x, and then you find a different x2 != x such that
H(x) = H(x2). Collision is when you come up with any two values, x
and x2 != x, such that H(x) = H(x2).
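To make the difference concrete, here is a toy experiment. It assumes a
deliberately weak hash, h16, made by truncating SHA-256 to 16 bits so
that brute force finishes quickly; it says nothing about the strength
of any real hash function. The collision search is free to pick both
inputs, while the second-pre-image search must hit the hash of one
fixed input:

```python
import hashlib
from itertools import count

def h16(data: bytes) -> bytes:
    # Toy hash: the first 2 bytes (16 bits) of SHA-256.
    return hashlib.sha256(data).digest()[:2]

def find_collision():
    # Attacker chooses both inputs: remember every output seen so far.
    seen = {}
    for i in count():
        x = str(i).encode()
        y = h16(x)
        if y in seen:
            return seen[y], x, i + 1
        seen[y] = x

def find_second_preimage(x: bytes):
    # Attacker is handed x and must match its hash with a different input.
    target = h16(x)
    for i in count():
        x2 = b"alt-" + str(i).encode()
        if x2 != x and h16(x2) == target:
            return x2, i + 1

a, b, collision_tries = find_collision()
fixed = b"fixed"
other, second_preimage_tries = find_second_preimage(fixed)
print(collision_tries, second_preimage_tries)
```

For an n-bit hash with no structural weakness, a collision is expected
after roughly 2^(n/2) tries and a second pre-image after roughly 2^n
tries, which is why collision-resistance is the harder property to
retain.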
Tahoe-LAFS's semantics of immutable file caps is that the cap is an
*identifier* of the file, not just a digital signature or message
authentication code on the file, as demonstrated in the
Alice->Bob->Carol scenario above. Therefore, Tahoe-LAFS requires collision
resistance from its secure hash algorithm and not just second-
preimage-resistance. It is too bad we can't make do with second-
preimage-resistance, because we have much greater confidence in the
second-preimage-resistance of our hash functions than in collision-
resistance. SHA1, for example, has second-preimage-resistance (as
far as we know) but not collision-resistance. (By the way, I believe
that git has the same semantics for its hashes that Tahoe-LAFS has
for its immutable file caps and that, contrary to Linus Torvalds,
Perry Metzger, et al., git users are vulnerable to exploitation
by collisions. I'll try to write up my reasoning at some point.)
Regards,
Zooko