[tahoe-lafs-trac-stream] [tahoe-lafs] #1374: "walk through" or guide for people who want to read some code
tahoe-lafs
trac at tahoe-lafs.org
Wed Mar 2 14:43:19 PST 2011
#1374: "walk through" or guide for people who want to read some code
-----------------------------+----------------------------------------------
     Reporter:  zooko        |       Owner:  nobody
         Type:  enhancement  |      Status:  new
     Priority:  major        |   Milestone:  undecided
    Component:  unknown      |     Version:  1.8.2
   Resolution:               |    Keywords:  docs
Launchpad Bug:               |
-----------------------------+----------------------------------------------
Comment (by warner):
Actually, I'd say that tahoe's *server* is quite small, and it is the client
(i.e. "gateway") that represents the bulk of the code. We try to keep the
server dumb, and keep the smarts on the edges.

Also, assuming tarsnap's client is encrypting user data before sending it to
the server, I think tahoe's encoding/encrypting code is the most direct
equivalent. Probably everything from {{{Filenode}}}s down to the
server-accessing code in {{{storage_client.py}}}. The webapi code has no
equivalent in tarsnap (since their only frontend, as I understand it, is a
'tar'-like command), hence the HTTP-client CLI scripts don't directly
correspond either.

I've tried a handful of times (once per presentation, really) to come up with
a good sequence in which to teach how the tahoe system+codebase works. I
usually try a high-level overview followed by explaining the details from the
inside out, aiming to present a simplified version before adding more
details. Something like the following:

* diagram of client, gateway, servers
* coarse dataflow of (file goes to gateway, gateway creates shares, shares
  go to servers), and back again
* then zoom in to show immutable encoding process: refine "creates shares"
  to reveal random-key generation, encryption, then erasure-coding, then add
  a flat hash, show filecap generation, then add segmentation and merkle
  trees, then show CHK-key generation
* zoom out to show decoding process: hash checking, erasure-decoding,
  decryption
* zoom out to show share placement, hash-permutation, server allocation
  queries, download-time DYHB queries, pipelining/impatience
* zoom out to show corresponding server-side methods:
  {{{remote_allocate_buckets}}}, {{{RIBucketWriter.write}}},
  {{{remote_get_buckets}}}, {{{RIBucketReader.read}}}.
* zoom out to show internal {{{Filenode}}} objects, {{{.read()}}} method,
  {{{NodeMaker}}} methods
* zoom out to show webapi frontend, describe URL syntax. Show FTP, SFTP
  frontends next to it, but maybe defer description
* zoom out to show CLI command scripts
* zoom out to high-level client/gateway/servers diagram, add Introducer
* zoom in to show {{{IntroducerClient}}}, show how messages are distributed,
  how clients learn about servers, connect those {{{IServer}}} objects in
  {{{storage_client.py}}} to the diagram of server-allocation code
* introduce mutable files, refine "create shares" process in stages like
  with immutable files: start with SDMF, show signed roothash, then
  introduce segmentation and MDMF
* show server-side mutable methods: {{{testv-and-readv-and-writev}}}
* explain UCWE, recovery methods, {{{.modify}}} retry mechanism
* explain directories, encoding format, {{{DirectoryNode}}} objects and
  methods, webapi syntax
* then figure out what's left: backupdb, leases, deep-check/verify/renew,
  FTP/SFTP, admin node-creation/start/stop commands, stats-gatherer,
  key-generator, upload-download history/status, web utils like
  welcome/provisioning/reliability pages
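
To make a few of the steps above concrete for a first-time reader, here is a
rough sketch of content-derived key generation, segmentation, a merkle tree
over the segments, and hash-permuted server ordering. It is an illustration
only, not tahoe's actual code or share format: every function name, the plain
SHA-256 hashing, the "key:"/"leaf:"/"node:" tags, and the 16-byte key length
are assumptions made up for this example. Encryption and erasure-coding are
left out because they need third-party libraries.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def convergent_key(plaintext: bytes, convergence_secret: bytes) -> bytes:
    # Hypothetical key derivation: the key depends only on the secret and
    # the file contents, so the same file always gets the same key.
    return sha256(b"key:" + convergence_secret + b":" + plaintext)[:16]


def split_segments(data: bytes, segment_size: int) -> List[bytes]:
    # Cut the data into fixed-size segments; the last one may be short.
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    segments = [data[i:i + segment_size]
                for i in range(0, len(data), segment_size)]
    return segments or [b""]


def merkle_root(segments: List[bytes]) -> bytes:
    # Hash each segment into a leaf, then hash pairs upward to one root.
    # An odd-sized level is padded by repeating its last hash (a
    # simplification chosen for this sketch).
    level = [sha256(b"leaf:" + seg) for seg in segments]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256(b"node:" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def permuted_servers(server_ids: List[bytes],
                     storage_index: bytes) -> List[bytes]:
    # Order servers by a hash of (server id, storage index), so each file
    # sees its own stable pseudo-random ordering of the same servers.
    return sorted(server_ids, key=lambda sid: sha256(sid + storage_index))


if __name__ == "__main__":
    data = b"example file contents " * 10
    key = convergent_key(data, b"my convergence secret")
    segments = split_segments(data, 64)
    root = merkle_root(segments)
    storage_index = sha256(key)[:16]  # assumption: index derived from key
    servers = [b"server-a", b"server-b", b"server-c", b"server-d"]
    print("key:", key.hex())
    print("segments:", len(segments))
    print("merkle root:", root.hex())
    print("server order:", permuted_servers(servers, storage_index))
```

The point of the merkle tree is that a downloader can check one segment at a
time against the root instead of hashing the whole file first, and the point
of the permutation is that uploader and downloader compute the same server
order for a given file without coordinating.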
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1374#comment:4>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage