<html><body><div align="left"  >2 I like 2</div><br><div align="left"  ><sup>Jason Johnson</sup><br></div><div align="left"  ><b><sup>ProjectSeven.us</b></sup><br></div><div align="left"  ><b><sup>NetGreen.us</b></sup></div><i><br><br>On 13/11/13 01:09:18,<br>Zooko O'Whielacronx <zookog@gmail.com> wrote:<br></i><div dir="ltr"><div class="">

      <p>Folks:</p><p>(This is a copy of <a href="https://tahoe-lafs.org/trac/tahoe-lafs/ticket/795#comment:13">https://tahoe-lafs.org/trac/tahoe-lafs/ticket/795#comment:13</a> .)<br></p><p>

Here's my rendition of our discussion of add-only sets at the Tahoe-LAFS

 Summit today. (As usual, I altered and embellished this story 

significantly while writing it down, and people who were present to 

witness the original discussion are invited to chime in.)

</p>

<p>

An add-only cap doesn't have to also be a write-only cap. It might be 

good for some use cases that you can give someone a cap that lets them 

read the whole set, and add elements into the set, without letting them 

remove elements or change previously-added elements. It might be good in

 some <em>other</em> use cases to have an "add-only&write-only" cap,

 which allows you to add elements into the set but doesn't let you read 

elements of the set, nor remove nor change previously-added elements. We

 agreed to focus on the former case for now, because it is easier to 

design and implement a solution to it. (See <a class="" href="https://tahoe-lafs.org/trac/tahoe-lafs/ticket/796" title="enhancement: write-only caps (new)">#796</a> for discussion of write-only caps.)

</p>

<p>

We agreed to forget about erasure-coding, which makes an 

already-confusing problem (how to implement add-only sets without 

allowing a few malicious servers, adders, or set-repairers to perform rollback attack or selection attack), into a very-confusing problem that exceeded my brain's ability to grapple with it.

</p>

<p>

So, for now, assume that add-only sets don't use erasure-coding at all.

</p>

<p>

Now, the basic design we came up with is like this. I'll explain it in multiple passes of successive refinement of the design.

</p>

<h3 id="FIRSTPASS:DESIGN0">FIRST PASS: DESIGN "0"</h3>

<p>

An authorized adder (someone who holds an add-cap) can generate 

"elements", which are bytestrings that can be added into the set. (I 

mispronounced "elements" as "elephants" at one point, and from that 

point forward the design was expressed in terms of a circus act 

involving elephants.)

</p>

<p>

Elephants have an identity as well as a value (bytestring), so:

</p>

<pre class="">    aos = DistributedSecureAddOnlySet()

    aos.add_elephant(b"\xFF"*100)

    aos.add_elephant(b"\xFF"*100)

</pre><p>

results in <tt>aos</tt> containing two elephants, not one, even though each elephant has the same value (the bytestring with one hundred 0xFF bytes in it).

</p>

<p>

<tt>aos.add_elephant()</tt> generates a random 256-bit nonce to make 

this elephant different from any other elephant with the same value. I 

call this "putting a tag on the elephant's ear" — a "tagged elephant" is

 a value plus a nonce. Even if two elephants are identical twins, they 

can be distinguished by the unique nonce written on their ear-tags.

</p>

<p>

<tt>aos.add_elephant()</tt> then puts a digital signature on the 

tagged-elephant (using the add-only-cap, which contains an Ed25519 

private key), and sends a copy of the tagged-elephant to every one of <tt>N</tt> different servers. Putting a digital signature on a tagged-elephant is called "wrapping a net around it".

</p>

<p>

A reader downloads all the tagged-elephants from all the servers, checks

 all the signatures, takes the union of the results, and returns the 

resulting set of elephants.

</p>

<p>

Design "A" relies on <em>at least one</em> of the servers that you reach

 to save you from rollback or selection attacks. Such a server does this

 by knowing, and honestly serving up to you, a fresh and complete set of

 tagged-elephants. “Rollback” is serving you a version of the set that 

existed at some previous time, so the honest server giving you a copy of

 the most recent set protects you from rollback attack. “Selection” is 

omitting some elephants from the set, so the honest server giving you a 

complete copy of the set protects you from selection attack.

</p>

<h3 id="SECONDPASS:DESIGN1">SECOND PASS: DESIGN "1"</h3>

<p>

We can extend Design "0" to make it harder for malicious servers to 

perform selection attacks on readers, even when the reader doesn't reach

 an honest server who has a complete copy of the most recent set.

</p>

<p>

The unnecessary vulnerability in Design "0" is that each tagged-elephant

 is signed independently of the other tagged-elephants, so malicious 

servers can deliver some tagged-elephants to a reader and withhold other

 tagged-elephants, and the reader will accept the resulting set, thus 

falling for a selection attack. To reduce this vulnerability, adders 

will sign all of the <em>current</em> tagged-elephants along with their 

new tagged-elephant with a single signature. More precisely, let the 

"identity" of a tagged-elephant be the secure hash of the 

tagged-elephant (i.e. the secure hash of the nonce concatenated with the

 value). The signature on a new tagged-elephant covers the identity of 

that tagged-elephant, concatenated with the identities of all extant 

tagged-elephants, under a single signature. In circus terms, you add the

 new tagged-elephant into a pile of tagged-elephants and throw a net 

over the entire pile, including the new tagged-elephant.

</p>

<p>

Now, malicious servers can't omit any of the older tagged-elephants 

without also omitting the new tagged-elephant. Readers will not accept 

the new tagged-elephant unless they also have a copy of all of the other

 tagged-elephants that were signed with the same signature. This limits 

the servers's options for selection attacks.

</p>

<h3 id="THIRDPASS:DESIGN2">THIRD PASS: DESIGN "2"</h3>

<p>

We can refine Design "1" to make it cleaner and more CPU-efficient and 

network-efficient. This will also lay the groundwork for an efficient 

network protocol.

</p>

<p>

The unnecessary "dirtiness" in Design "1" is that the digital signatures

 on older tagged-elephants become extraneous once you add a new digital 

signature. We have a mass of tagged-elephants, we throw a net over the 

whole mass, then later when we add a new tagged-elephant to the pile, we

 throw a new net on top of the new (slightly larger) pile. Now the <em>underlying</em>

 net has become redundant: once you've verified the signature of the 

outermost net, there is no need to check the signature of the inner net.

 In fact, if one implementation checks the signature of the inner net 

and another implementation does not check it, then a malicious adder 

colluding with a malicious server could cause the implementations to 

differ in their results, by putting an invalid net (an invalid 

signature) topped by a new tagged-elephant with a valid net. (Daira was 

the one who noticed that issue.)

</p>

<p>

To make this cleaner and more efficient, we will never put a net around a

 net, and instead we'll keep each tagged-elephant in a box. When you 

want to add a new tagged-elephant to a set, you rip off and throw away 

any extant nets, then you put the new tagged-elephant in a box which is nailed on top of the previous topmost box.

 Then you wrap a net around the new topmost box. "Nailing" box Y on top 

of box X means taking the secure hash of box X and appending that to box

 Y (before signing box Y). A "box" is a tagged-elephant concatenated 

with any number of "nails", each of which is the secure hash of a 

previous box.

</p>

<p>

(Note that you can sometimes have two or more boxes precariously perched

 at the top of a stack, when two adders have simultaneously added a box 

before each saw the other's new box. That's okay — the next time an 

adder adds a box on top of this stack, he'll nail his new box to <em>each</em> of the previous topmost boxes.)

</p>

<p>

Boxes are a lot more CPU-efficient than nets, and more importantly 

nobody (neither readers, adders, nor servers) needs to revisit a 

lower-down box in order to add a new top-most box. Once you nail box Y 

on top of box X, then you can later add box Z just by taking the hash of

 box Y, without revisiting box X.

</p>

<p>

Note that we need two different secure hashes here: one is the identity 

of a tagged-elephant, which is the secure hash of: the nonce 

concatenated with the value. The other is the hash of the box, which is 

the secure hash of: the identity of a tagged-elephant concatenated with 

the hashes of any previous boxes. We need the identity of a 

tagged-elephant for finding out whether a certain tagged-elephant 

already exists in a stack (regardless of what position it occupies 

within that stack), and we need the hash of the box for efficiently 

verifying that all the tagged-elephants in a stack were approved by an 

authorized adder.

</p>

<p>

This also leads to the efficient network protocol: an adder can remember

 (cache) the Directed Acyclic Graph of boxes which a given server 

previously told the adder about. When the adder wants to add a new 

tagged-elephant or a set of new tagged-elephants to that server, he can 

send just the boxes which would be <em>new</em> to that server, assuming

 that the server hasn't learned anything new since the last time they 

talked. Readers can do likewise, remembering what each server previously

 told them about, and asking the server to just tell them about things 

that are not already covered the topmost box(es) that the reader already

 knows about.

</p>

<h3 id="CONCLUSION">CONCLUSION</h3>

<p>

Okay, that's it! I think Design "2" is a good one. It has good security 

against rollback or selection attacks by malicious servers (assuming 

some kind of whitelisting of servers! Which is ticket <a class="" href="https://tahoe-lafs.org/trac/tahoe-lafs/ticket/467" title="enhancement: allow the user to specify which servers a given gateway will use for ... (new)">#467</a> and is not yet implemented.) And, it doesn't go <em>too</em>

 far over the top in terms of complexity; it seems more intuitive to me 

than (my vague memories of) previous attempts to design add-only sets 

for LAFS.

</p>

<p>

(By the way, there are a few other possible <a class="" href="https://tahoe-lafs.org/trac/tahoe-lafs/query?status=%21closed&keywords=%7Erollback&order=priority">ways to strengthen defenses against rollback attack</a>, which we've previously considered in the context of mutable files, but they probably also apply to add-only sets.)

</p>

<p>

I'm excited about this design being feasible, because I think add-only 

sets could be a critical building block in valuable use-cases such as 

secure logging, secure email, secure backup, and more. <br></p><p>Regards,</p><p>Zooko</p><p>P.S. Thanks to Isis and Mike for showing up today and, when asked what Tahoe-LAFS improvements they were interested in, suggesting add-only sets.<br>

</p>

    </div></div>

</body></html>