[tahoe-lafs-trac-stream] [tahoe-lafs] #678: converge same file, same K, different N (was: converge same file, same K, different M)

tahoe-lafs trac at tahoe-lafs.org
Sat Jan 29 21:10:18 UTC 2011


#678: converge same file, same K, different N
-------------------------------+--------------------------------------------
     Reporter:  zooko          |       Owner:                                      
         Type:  enhancement    |      Status:  new                                 
     Priority:  major          |   Milestone:  undecided                           
    Component:  code-encoding  |     Version:  1.3.0                               
   Resolution:                 |    Keywords:  newcaps space-efficiency performance
Launchpad Bug:                 |  
-------------------------------+--------------------------------------------

Old description:

> The underlying erasure code is "systematic", which means the first
> {{{K}}} shares are simply the segments of the file (except that the last
> one, which contains the end of the file, is padded out to be the same
> size as the others).  The erasure code also has the property (I don't
> know what it is called) that the "check shares" or "secondary shares" --
> the ones after the first {{{K}}} -- are also the same regardless of what
> {{{M}}} is.
>
> Therefore if you upload a file with, e.g. {{{K=13}}}, {{{M=16}}} and then
> you __re-upload__ it with {{{K=13}}}, {{{M=26}}}, then the index 0
> through the index 15 share that you upload would be exactly the same as
> they were in the original upload (if you used the same encryption key to
> encrypt the file before erasure-coding it).
>
> However, Tahoe currently doesn't take advantage of this coincidence at
> all, because it doesn't use the same encryption key for those two files.
> Instead it includes {{{M}}} in the generation of the encryption key, and
> in the resulting immutable-file read-cap, so that the two uploads of the
> same file with the same {{{K}}} and different {{{M}}} result in
> completely different sets of shares and different read-caps.  specs:
> [source:docs/specifications/file-encoding.txt at 20090222054054-4233b-
> ce16f0d882f804485e792782c1a9527db25114d0 file-encoding.txt]; source code:
> [source:src/allmydata/immutable/upload.py at 20090304013715-4233b-
> fbbf44eba801fb41795d4421bbcb7a028d45e6ff#L1116 upload.py]
>
> To fix this ticket, figure out how to retain all of the safety and
> security properties that Tahoe immutable-file currently have, while also
> letting those two uploads share their first 16 shares.
>
> (By the way, the reason I was reminded of this is that I'm currently
> doing an upload exactly like this: [http://allmydata.org/pipermail/tahoe-
> dev/2009-April/001554.html dogfood tasting report] . :-))

New description:

 The underlying erasure code is "systematic", which means the first {{{K}}}
 shares are simply the segments of the file (except that the last one,
 which contains the end of the file, is padded out to be the same size as
 the others).  The erasure code also has the property (I don't know what it
 is called) that the "check shares" or "secondary shares" -- the ones after
 the first {{{K}}} -- are also the same regardless of what {{{N}}} is.

 Therefore if you upload a file with, e.g., {{{K=13}}}, {{{N=16}}} and then
 you __re-upload__ it with {{{K=13}}}, {{{N=26}}}, then the shares at index
 0 through index 15 of the second upload would be exactly the same as they
 were in the original upload (if you used the same encryption key to
 encrypt the file before erasure-coding it).
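
 The prefix property described above can be illustrated with a toy
 systematic code. This is only a sketch under stated assumptions: it works
 on single symbols over the prime field GF(257) using polynomial
 evaluation, and the name encode is hypothetical. It is not the erasure
 code that Tahoe actually uses, which operates on whole blocks of data.

```python
# Toy systematic erasure code over the prime field GF(257).
# Illustration only: NOT Tahoe's encoder and NOT its underlying library.
P = 257

def encode(data, n):
    """Return n shares for the k symbols in data (each in range(P)).

    Share i is the value at x = i of the unique polynomial of degree < k
    that passes through the points (j, data[j]) for j in range(k).
    """
    k = len(data)
    if not k <= n <= P:
        raise ValueError("need k <= n <= P")
    shares = []
    for x in range(n):
        total = 0
        for j in range(k):
            # Lagrange basis polynomial for point j, evaluated at x.
            num, den = 1, 1
            for m in range(k):
                if m != j:
                    num = num * (x - m) % P
                    den = den * (j - m) % P
            # pow(den, -1, P) is the modular inverse (Python 3.8+).
            total = (total + data[j] * num * pow(den, -1, P)) % P
        shares.append(total)
    return shares

data = list(range(10, 23))   # 13 input symbols, so K=13
small = encode(data, 16)     # K=13, N=16
large = encode(data, 26)     # K=13, N=26

assert small[:13] == data    # systematic: first K shares are the input
assert large[:16] == small   # shares 0..15 do not depend on N
```

 In this toy, share i depends only on the input and on i, never on N, which
 is why the N=16 shares are a prefix of the N=26 shares.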

 However, Tahoe currently doesn't take advantage of this coincidence at
 all, because it doesn't use the same encryption key for those two files.
 Instead it includes {{{N}}} in the generation of the encryption key, and
 in the resulting immutable-file read-cap, so that the two uploads of the
 same file with the same {{{K}}} and different {{{N}}} result in completely
 different sets of shares and different read-caps.  specs:
 [source:docs/specifications/file-encoding.txt at 20090222054054-4233b-ce16f0d882f804485e792782c1a9527db25114d0 file-encoding.txt];
 source code:
 [source:src/allmydata/immutable/upload.py at 20090304013715-4233b-fbbf44eba801fb41795d4421bbcb7a028d45e6ff#L1116 upload.py]
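
 The effect of mixing {{{N}}} into the key can be sketched as follows. The
 function derive_key and its include_n flag are hypothetical, and plain
 SHA-256 stands in for whatever hash construction upload.py really uses;
 the actual derivation may take further inputs that are omitted here.

```python
import hashlib

def derive_key(plaintext, k, n, include_n):
    """Hypothetical convergent-key derivation (not Tahoe's real one).

    The key is a hash of the encoding parameters followed by the file
    contents. include_n controls whether N is one of those parameters.
    """
    params = f"{k},{n}" if include_n else f"{k}"
    h = hashlib.sha256()
    h.update(params.encode("ascii") + b"\n")
    h.update(plaintext)
    return h.digest()[:16]   # truncate to a 16-byte key

contents = b"same file contents"

# With N in the derivation, the two uploads get unrelated keys,
# so their ciphertexts and shares differ.
key_a = derive_key(contents, 13, 16, include_n=True)
key_b = derive_key(contents, 13, 26, include_n=True)
assert key_a != key_b

# Without N, both uploads derive the same key and could converge.
key_c = derive_key(contents, 13, 16, include_n=False)
key_d = derive_key(contents, 13, 26, include_n=False)
assert key_c == key_d
```

 Dropping {{{N}}} from the derivation in this way shows only why the keys
 would match. It says nothing about whether the existing safety and
 security properties would still hold, which is the open question of this
 ticket.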

 To fix this ticket, figure out how to retain all of the safety and
 security properties that Tahoe immutable files currently have, while also
 letting those two uploads share their first 16 shares.

 (By the way, the reason I was reminded of this is that I'm currently doing
 an upload exactly like this:
 [http://allmydata.org/pipermail/tahoe-dev/2009-April/001554.html dogfood tasting report]. :-))

--

Comment (by warner):

 s/M/N/ to match our existing "k-of-N" terminology

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/678#comment:8>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage

