[tahoe-lafs-trac-stream] [tahoe-lafs] #1374: "walk through" or guide for people who want to read some code

tahoe-lafs trac at tahoe-lafs.org
Wed Mar 2 14:43:19 PST 2011


#1374: "walk through" or guide for people who want to read some code
-----------------------------+----------------------------------------------
     Reporter:  zooko        |       Owner:  nobody   
         Type:  enhancement  |      Status:  new      
     Priority:  major        |   Milestone:  undecided
    Component:  unknown      |     Version:  1.8.2    
   Resolution:               |    Keywords:  docs     
Launchpad Bug:               |  
-----------------------------+----------------------------------------------

Comment (by warner):

 Actually, I'd say that tahoe's *server* is quite small, and it is the client
 (i.e. "gateway") that represents the bulk of the code. We try to keep the
 server dumb, and keep the smarts on the edges.

 Also, assuming tarsnap's client is encrypting user data before sending it to
 the server, I think tahoe's encoding/encrypting code is the most direct
 equivalent. Probably everything from {{{Filenode}}}s down to the
 server-accessing code in {{{storage_client.py}}}. The webapi code has no
 equivalent in tarsnap (since their only frontend, as I understand it, is a
 'tar'-like command), hence the HTTP-client CLI scripts don't directly
 correspond either.

 I've tried a handful of times (once per presentation, really) to come up with
 a good sequence in which to teach how the tahoe system+codebase works. I
 usually try a high-level overview followed by explaining the details from the
 inside out, aiming to present a simplified version before adding more
 details. Something like the following:

  * diagram of client, gateway, servers
  * coarse dataflow of (file goes to gateway, gateway creates shares, shares
    go to servers), and back again
  * then zoom in to show immutable encoding process: refine "creates shares"
    to reveal random-key generation, encryption, then erasure-coding, then add
    a flat hash, show filecap generation, then add segmentation and merkle
    trees, then show CHK-key generation
  * zoom out to show decoding process: hash checking, erasure-decoding,
    decryption
  * zoom out to show share placement, hash-permutation, server allocation
    queries, download-time DYHB queries, pipelining/impatience
  * zoom out to show corresponding server-side methods:
    {{{remote_allocate_buckets}}}, {{{RIBucketWriter.write}}},
    {{{remote_get_buckets}}}, {{{RIBucketReader.read}}}.
  * zoom out to show internal {{{Filenode}}} objects, {{{.read()}}} method,
    {{{NodeMaker}}} methods
  * zoom out to show webapi frontend, describe URL syntax. Show FTP, SFTP
    frontends next to it, but maybe defer description
  * zoom out to show CLI command scripts
  * zoom out to high-level client/gateway/servers diagram, add Introducer
  * zoom in to show {{{IntroducerClient}}}, show how messages are distributed,
    how clients learn about servers, connect those {{{IServer}}} objects in
    {{{storage_client.py}}} to the diagram of server-allocation code
  * introduce mutable files, refine "creates shares" process in stages like
    with immutable files: start with SMDF, show signed roothash, then
    introduce segmentation and MDMF
  * show server-side mutable methods: {{{testv-and-readv-and-writev}}}
  * explain UCWE, recovery methods, {{{.modify}}} retry mechanism
  * explain directories, encoding format, {{{DirectoryNode}}} objects and
    methods, webapi syntax
  * then figure out what's left: backupdb, leases, deep-check/verify/renew,
    FTP/SFTP, admin node-creation/start/stop commands, stats-gatherer,
    key-generator, upload-download history/status, web utils like
    welcome/provisioning/reliability pages
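
 The "share placement, hash-permutation" step in the list above is probably
 the easiest one to show in a few lines. What follows is only a rough sketch,
 not tahoe's actual code: the function names ({{{permuted_servers}}},
 {{{place_shares}}}) are made up for illustration, plain SHA-256 stands in for
 whatever tagged hash the real code uses, and the round-robin assignment
 replaces the real server allocation queries. The idea it shows is that each
 file's storage index gives a different, but repeatable, ordering of the same
 set of servers.

```python
import hashlib


def permuted_servers(storage_index, server_ids):
    """Return server_ids in a per-file order.

    Sketch only: each server is ranked by SHA-256(storage_index + server_id),
    so the ordering depends on the file but not on the input order.
    """
    return sorted(
        server_ids,
        key=lambda sid: hashlib.sha256(storage_index + sid).digest(),
    )


def place_shares(storage_index, server_ids, total_shares):
    """Assign share numbers to servers by walking the permuted list.

    Hypothetical simplification: shares are dealt out round-robin. A real
    uploader would ask each server whether it accepts the share and move on
    to the next server if it refuses.
    """
    if not server_ids:
        raise ValueError("need at least one server")
    ring = permuted_servers(storage_index, server_ids)
    placement = {}
    for sharenum in range(total_shares):
        server = ring[sharenum % len(ring)]
        placement.setdefault(server, []).append(sharenum)
    return placement


if __name__ == "__main__":
    servers = [b"server-a", b"server-b", b"server-c", b"server-d"]
    for si in (b"storage-index-1", b"storage-index-2"):
        print(si, place_shares(si, servers, 10))
```

 A downloader that knows the storage index and the same server list can
 recompute the ordering and query the servers near the front of it first,
 which is where the DYHB queries in that list item come in.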

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1374#comment:4>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage

