[tahoe-dev] [tahoe-lafs] #1109: let the get_buckets() response include the first block

tahoe-lafs trac at tahoe-lafs.org
Sun Jul 18 01:30:12 UTC 2010


#1109: let the get_buckets() response include the first block
------------------------------+---------------------------------------------
     Reporter:  zooko         |       Owner:  warner              
         Type:  enhancement   |      Status:  new                 
     Priority:  major         |   Milestone:  1.8.0               
    Component:  code-network  |     Version:  1.7.0               
   Resolution:                |    Keywords:  download performance
Launchpad Bug:                |  
------------------------------+---------------------------------------------

Comment (by warner):

 Yeah, in general, I think a more stateless immutable share-read interface
 would be better. The mutable share interface (which was written about 6
 months later) is stateless, and that makes life a bit easier. That
 interface takes a read vector and a set of share numbers, with an empty
 set meaning "all shares that you are holding", and I think the same
 interface would work for immutable files.
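
 As a rough sketch of the read-vector semantics described above (not the
 actual storage-server code): the function name `read_shares` and the
 in-memory dict standing in for share files are hypothetical, chosen only
 to show how an empty share set means "every share held".

```python
def read_shares(shares, sharenums, readv):
    """Stateless read over a set of shares.

    shares    -- dict mapping share number -> bytes (hypothetical stand-in
                 for the share files a server holds for one storage index)
    sharenums -- iterable of share numbers; empty means "all shares held"
    readv     -- list of (offset, length) pairs, applied to every share

    Returns a dict mapping share number -> list of byte strings, one per
    (offset, length) pair. Shares the server does not hold are omitted.
    """
    wanted = set(sharenums) or set(shares)
    return {
        num: [shares[num][offset:offset + length] for (offset, length) in readv]
        for num in sorted(wanted)
        if num in shares
    }
```

 Because each call carries everything needed to satisfy it, the server
 keeps no per-download state between calls.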

 Such an interface assumes that the server can efficiently open+seek+read
 the same file several times in quick succession (closing the filehandle
 between each call), whereas the current stateful interface keeps the
 filehandle open for the entire download. I suspect that most modern
 OS/filesystems cache recently-opened files and make this fairly quick, but
 it's worth doing some benchmarks to be certain. Also, we'd want to
 consider the interaction with GC and/or external tools which delete
 shares: keeping the filehandle open means a download will survive the
 share being deleted in the middle, whereas a stateless interface would
 not.
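
 The open+seek+read pattern, and the deletion caveat, can be illustrated
 with a small sketch. The helper names `read_range` and
 `survives_deletion` are hypothetical, and a temporary file stands in for
 a share on disk.

```python
import os
import tempfile


def read_range(path, offset, length):
    """Open, seek, read and close on every call; no handle outlives it."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def survives_deletion():
    """Return False if a stateless read fails once the file is removed."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"share data block")
    # First read works: the file is reopened for this call alone.
    assert read_range(path, 6, 4) == b"data"
    # Simulate GC or an external tool deleting the share mid-download.
    os.remove(path)
    try:
        read_range(path, 6, 4)
    except FileNotFoundError:
        return False
    return True
```

 Timing repeated `read_range` calls against reads on a single long-lived
 handle would be one simple way to run the benchmark suggested above.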

 A stateless interface would also make us slightly more resistant to a DoS
 attack in which the attacker opens lots of shares at once and tries to
 fill the file-descriptor table.

 I'd want to leave the signature of get_buckets() alone, and add a new
 method instead. The server-version-information dictionary could be used to
 advertise the availability of such a method.
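
 A client-side check along these lines might look like the sketch below.
 The dictionary layout, the `PROTOCOL_KEY` value and the
 `stateless-immutable-reads` flag are all assumptions made for
 illustration, not names taken from the existing server code.

```python
# Hypothetical key under which a server lists its storage-protocol features.
PROTOCOL_KEY = "storage-protocol"


def supports_stateless_reads(version_info):
    """True if the server's version dictionary advertises the new method."""
    params = version_info.get(PROTOCOL_KEY, {})
    return bool(params.get("stateless-immutable-reads", False))


def choose_read_method(version_info):
    """Pick the new method when advertised, else fall back to get_buckets()."""
    if supports_stateless_reads(version_info):
        return "stateless-read"
    return "get_buckets"
```

 Older servers that know nothing about the flag simply omit it, so
 clients fall back to get_buckets() without any change on the server
 side.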

 And overall, yeah, I'd like to optimize out that extra round trip, because
 my new downloader (#798) can currently retrieve a small file in just two
 round trips, and with this fix we could get that down to a single round
 trip, which would be great.

 I was probably a bit over-enthusiastic about using Foolscap remote
 references when I wrote the immutable interface... incidentally, one
 conceivable benefit of the stateful interface could come up in server-
 driven share migration. That code (living in server A) could talk to
 server B and send it a "please copy my share" message, passing it the
 remote-reference to the share's {{{BucketReader}}}, or a client could use
 Foolscap's third-party-reference ("Gifts") feature to let A and B move the
 data directly between themselves without requiring client-side bandwidth
 for the copy. Of course, since all shares are publicly readable, there's
 no authority-reducing benefit to doing it with bucket references over
 simply telling someone the storage-index and having them do the reads
 themselves. But when Accounting shows up, it might wind up being handy to
 have a bucket-plus-read-authority object available to pass around.

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1109#comment:1>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage

