the new 2014 Add-Only Sets

Wed Nov 13 18:39:10 UTC 2013

2 I like 2

Jason Johnson
ProjectSeven.us
NetGreen.us

On 13/11/13 01:09:18,
Zooko O'Whielacronx <zookog at gmail.com> wrote:
>Folks:
>
>(This is a copy of
>https://tahoe-lafs.org/trac/tahoe-lafs/ticket/795#comment:13 .)
>
>Here's my rendition of our discussion of add-only sets at the Tahoe-LAFS
>Summit today. (As usual, I altered and embellished this story significantly
>while writing it down, and people who were present to witness the original
>discussion are invited to chime in.)
>
>An add-only cap doesn't have to also be a write-only cap. It might be good
>for some use cases that you can give someone a cap that lets them read the
>whole set, and add elements into the set, without letting them remove
>elements or change previously-added elements. It might be good in some
>*other* use cases to have an "add-only&write-only" cap, which allows you to
>add elements into the set but doesn't let you read elements of the set, nor
>remove nor change previously-added elements. We agreed to focus on the
>former case for now, because it is easier to design and implement a
>solution to it. (See
>#796<https://tahoe-lafs.org/trac/tahoe-lafs/ticket/796>for discussion
>of write-only caps.)
>
>We agreed to forget about erasure-coding, which makes an already-confusing
>problem (how to implement add-only sets without allowing a few malicious
>servers, adders, or set-repairers to perform *rollback attack* or *selection
>attack*), into a *very*-confusing problem that exceeded my brain's ability
>to grapple with it.
>
>So, for now, assume that add-only sets don't use erasure-coding at all.
>
>Now, the basic design we came up with is like this. I'll explain it in
>multiple passes of successive refinement of the design.
>FIRST PASS: DESIGN "0"
>
>An authorized adder (someone who holds an add-cap) can generate "elements",
>which are bytestrings that can be added into the set. (I mispronounced
>"elements" as "elephants" at one point, and from that point forward the
>design was expressed in terms of a circus act involving elephants.)
>
>Elephants have an identity as well as a value (bytestring), so:
>
>    aos = DistributedSecureAddOnlySet()
>    aos.add_elephant(b"\xFF"*100)
>    aos.add_elephant(b"\xFF"*100)
>
>results in aos containing *two* elephants, not one, even though each
>elephant has the same value (the bytestring with one hundred 0xFF bytes in
>it).
>
>aos.add_elephant() generates a random 256-bit nonce to make this elephant
>different from any other elephant with the same value. I call this "putting
>a tag on the elephant's ear" — a "tagged elephant" is a value plus a nonce.
>Even if two elephants are identical twins, they can be distinguished by the
>unique nonce written on their ear-tags.
>
>aos.add_elephant() then puts a digital signature on the tagged-elephant
>(using the add-only-cap, which contains an Ed25519 private key), and sends
>a copy of the tagged-elephant to every one of N different servers. Putting
>a digital signature on a tagged-elephant is called "wrapping a net around
>it".
>
>A reader downloads all the tagged-elephants from all the servers, checks
>all the signatures, takes the union of the results, and returns the
>resulting set of elephants.
>
>Design "A" relies on *at least one* of the servers that you reach to save
>you from rollback or selection attacks. Such a server does this by knowing,
>and honestly serving up to you, a fresh and complete set of
>tagged-elephants. “Rollback” is serving you a version of the set that
>existed at some previous time, so the honest server giving you a copy of
>the most recent set protects you from rollback attack. “Selection” is
>omitting some elephants from the set, so the honest server giving you a
>complete copy of the set protects you from selection attack.
>SECOND PASS: DESIGN "1"
>
>We can extend Design "0" to make it harder for malicious servers to perform
>selection attacks on readers, even when the reader doesn't reach an honest
>server who has a complete copy of the most recent set.
>
>The unnecessary vulnerability in Design "0" is that each tagged-elephant is
>signed independently of the other tagged-elephants, so malicious servers
>can deliver some tagged-elephants to a reader and withhold other
>tagged-elephants, and the reader will accept the resulting set, thus
>falling for a selection attack. To reduce this vulnerability, adders will
>sign all of the *current* tagged-elephants along with their new
>tagged-elephant with a single signature. More precisely, let the "identity"
>of a tagged-elephant be the secure hash of the tagged-elephant (i.e. the
>secure hash of the nonce concatenated with the value). The signature on a
>new tagged-elephant covers the identity of that tagged-elephant,
>concatenated with the identities of all extant tagged-elephants, under a
>single signature. In circus terms, you add the new tagged-elephant into a
>pile of tagged-elephants and throw a net over the entire pile, including
>the new tagged-elephant.
>
>Now, malicious servers can't omit any of the older tagged-elephants without
>also omitting the new tagged-elephant. Readers will not accept the new
>tagged-elephant unless they also have a copy of all of the other
>tagged-elephants that were signed with the same signature. This limits the
>servers's options for selection attacks.
>THIRD PASS: DESIGN "2"
>
>We can refine Design "1" to make it cleaner and more CPU-efficient and
>network-efficient. This will also lay the groundwork for an efficient
>network protocol.
>
>The unnecessary "dirtiness" in Design "1" is that the digital signatures on
>older tagged-elephants become extraneous once you add a new digital
>signature. We have a mass of tagged-elephants, we throw a net over the
>whole mass, then later when we add a new tagged-elephant to the pile, we
>throw a new net on top of the new (slightly larger) pile. Now the
>*underlying* net has become redundant: once you've verified the signature
>of the outermost net, there is no need to check the signature of the inner
>net. In fact, if one implementation checks the signature of the inner net
>and another implementation does not check it, then a malicious adder
>colluding with a malicious server could cause the implementations to differ
>in their results, by putting an invalid net (an invalid signature) topped
>by a new tagged-elephant with a valid net. (Daira was the one who noticed
>that issue.)
>
>To make this cleaner and more efficient, we will never put a net around a
>net, and instead we'll keep each tagged-elephant in a box. When you want to
>add a new tagged-elephant to a set, you rip off and throw away any extant
>nets, then you *put the new tagged-elephant in a box which is nailed on top
>of the previous topmost box*. Then you wrap a net around the new topmost
>box. "Nailing" box Y on top of box X means taking the secure hash of box X
>and appending that to box Y (before signing box Y). A "box" is a
>tagged-elephant concatenated with any number of "nails", each of which is
>the secure hash of a previous box.
>
>(Note that you can sometimes have two or more boxes precariously perched at
>the top of a stack, when two adders have simultaneously added a box before
>each saw the other's new box. That's okay — the next time an adder adds a
>box on top of this stack, he'll nail his new box to *each* of the previous
>topmost boxes.)
>
>Boxes are a lot more CPU-efficient than nets, and more importantly nobody
>(neither readers, adders, nor servers) needs to revisit a lower-down box in
>order to add a new top-most box. Once you nail box Y on top of box X, then
>you can later add box Z just by taking the hash of box Y, without
>revisiting box X.
>
>Note that we need two different secure hashes here: one is the identity of
>a tagged-elephant, which is the secure hash of: the nonce concatenated with
>the value. The other is the hash of the box, which is the secure hash of:
>the identity of a tagged-elephant concatenated with the hashes of any
>previous boxes. We need the identity of a tagged-elephant for finding out
>whether a certain tagged-elephant already exists in a stack (regardless of
>what position it occupies within that stack), and we need the hash of the
>box for efficiently verifying that all the tagged-elephants in a stack were
>approved by an authorized adder.
>
>This also leads to the efficient network protocol: an adder can remember
>(cache) the Directed Acyclic Graph of boxes which a given server previously
>told the adder about. When the adder wants to add a new tagged-elephant or
>a set of new tagged-elephants to that server, he can send just the boxes
>which would be *new* to that server, assuming that the server hasn't
>learned anything new since the last time they talked. Readers can do
>likewise, remembering what each server previously told them about, and
>asking the server to just tell them about things that are not already
>covered the topmost box(es) that the reader already knows about.
>CONCLUSION
>
>Okay, that's it! I think Design "2" is a good one. It has good security
>against rollback or selection attacks by malicious servers (assuming some
>kind of whitelisting of servers! Which is ticket
>#467<https://tahoe-lafs.org/trac/tahoe-lafs/ticket/467>and is not yet
>implemented.) And, it doesn't go
>*too* far over the top in terms of complexity; it seems more intuitive to
>me than (my vague memories of) previous attempts to design add-only sets
>for LAFS.
>
>(By the way, there are a few other possible ways to strengthen defenses
>against rollback
>attack<https://tahoe-lafs.org/trac/tahoe-lafs/query?status=%21closed&keywords=%7Erollback&order=priority>,
>which we've previously considered in the context of mutable files, but they
>probably also apply to add-only sets.)
>
>I'm excited about this design being feasible, because I think add-only sets
>could be a critical building block in valuable use-cases such as secure
>logging, secure email, secure backup, and more.
>
>Regards,
>
>Zooko
>
>P.S. Thanks to Isis and Mike for showing up today and, when asked what
>Tahoe-LAFS improvements they were interested in, suggesting add-only sets.
>
>_______________________________________________
>tahoe-dev mailing list
>tahoe-dev at tahoe-lafs.org
>https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://tahoe-lafs.org/pipermail/tahoe-dev/attachments/20131113/2d0022ec/attachment.html>