[tahoe-lafs-trac-stream] [Tahoe-LAFS] #1374: "walk through" or guide for people who want to read some code

Tahoe-LAFS trac at tahoe-lafs.org
Thu Sep 11 22:21:23 UTC 2014


#1374: "walk through" or guide for people who want to read some code
-------------------------------+-----------------------
     Reporter:  zooko          |      Owner:  nobody
         Type:  enhancement    |     Status:  new
     Priority:  major          |  Milestone:  undecided
    Component:  documentation  |    Version:  1.8.2
   Resolution:                 |   Keywords:  docs
Launchpad Bug:                 |
-------------------------------+-----------------------
Changes (by warner):

 * component:  unknown => documentation


Old description:

> Riastradh writes on IRC: "in an afternoon I was able to understand
> basically everything Tarsnap does, and had time to spare to take a closer
> look at some details"
>
> In doing so, he found the major bug in tarsnap's encryption which exposed
> all plaintext of all users until his bug report led to tarsnap fixing it.
>
> So: I would like it if people like Riastradh were more likely to read the
> source code of Tahoe-LAFS in an afternoon!
>
> I asked how many lines of code are in tarsnap, and he replied "The core is
> twenty thousand lines of heavily commented C (probably half the lines are
> comments).  A good chunk of that, about a thousand lines, is just a
> modified front end to bsdtar."
>
> Note that tarsnap distributes the source code only for the client (the
> source code of the server is secret).
>
> I ran the following command to get a rough estimate of the number of
> lines of code you would want to read to get a good idea of the behavior
> of the Tahoe-LAFS client:
>
> {{{
> find client.py codec.py control.py dirnode.py hashtree.py immutable/
> interfaces.py mutable/ node.py nodemaker.py  storage_client.py uri.py
> scripts -type f -print0 | xargs -0 wc -l
> }}}
> Result:
> {{{
>    21767 total
> }}}
> {{{
> find client.py codec.py control.py dirnode.py hashtree.py immutable/
> interfaces.py mutable/ node.py nodemaker.py  storage_client.py uri.py
> scripts -print0 | xargs -0 sloccount
> }}}
> Result:
> {{{
> SLOC    Directory       SLOC-by-Language (Sorted)
> 3906    scripts         python=3906
> 3199    immutable       python=3199
> 2885    top_dir         python=2885
> 2453    mutable         python=2453
> 1446    downloader      python=1446
>
>
> Totals grouped by language (dominant language first):
> python:       13889 (100.00%)
> }}}
>
> So it is approximately the same amount of code and comments as tarsnap!
> (Although perhaps you would want to include or exclude different files or
> directories in your reading of the client.) Of course, Tahoe-LAFS is
> written in Python and tarsnap is written in C.
>
> This ticket is to create a short document which describes for a newcomer
> how to find the source code which implements just the client, and then
> the source code which implements just the server, and what the different
> parts of the client source code are for, etc.
>
> Alternately, instead of making a separate document for this, take extant
> documents and add hyperlinks into them pointing to the relevant source
> code. Documents which might benefit from pointers to source code:
>
> * [source:trunk/docs/architecture.rst architecture.rst]
> * [source:trunk/docs/frontends/CLI.rst frontends/CLI.rst]
> * [source:trunk/docs/frontends/FTP-and-SFTP.rst frontends/FTP-and-SFTP.rst]
> * [source:trunk/docs/frontends/webapi.rst frontends/webapi.rst]
> * [source:trunk/docs/specifications/dirnodes.rst specifications/dirnodes.rst]
> * [source:trunk/docs/specifications/mutable.rst specifications/mutable.rst]
> * [source:trunk/docs/specifications/servers-of-happiness.rst specifications/servers-of-happiness.rst]
> * [source:trunk/docs/specifications/URI-extension.rst specifications/URI-extension.rst]
> * [source:trunk/docs/specifications/uri.rst specifications/uri.rst]
> * [source:trunk/docs/backupdb.rst backupdb.rst]
> * [source:trunk/docs/garbage-collection.rst garbage-collection.rst]
> * [source:trunk/docs/helper.rst helper.rst]
>
> Hm, and at the same time, the source files which are pointed to from
> these doc files should have links in them pointing back to these doc
> files.

New description:

 Riastradh writes on IRC: "in an afternoon I was able to understand
 basically everything Tarsnap does, and had time to spare to take a closer
 look at some details"

 In doing so, he found the major bug in tarsnap's encryption which exposed
 all plaintext of all users until his bug report led to tarsnap fixing it.

 So: I would like it if people like Riastradh were more likely to read the
 source code of Tahoe-LAFS in an afternoon!

 I asked how many lines of code are in tarsnap, and he replied "The core is
 twenty thousand lines of heavily commented C (probably half the lines are
 comments).  A good chunk of that, about a thousand lines, is just a
 modified front end to bsdtar."

 Note that tarsnap distributes the source code only for the client (the
 source code of the server is secret).

 I ran the following command to get a rough estimate of the number of lines
 of code you would want to read to get a good idea of the behavior of the
 Tahoe-LAFS client:

 {{{
 find client.py codec.py control.py dirnode.py hashtree.py immutable/
 interfaces.py mutable/ node.py nodemaker.py  storage_client.py uri.py
 scripts -type f -print0 | xargs -0 wc -l
 }}}
 Result:
 {{{
    21767 total
 }}}
 {{{
 find client.py codec.py control.py dirnode.py hashtree.py immutable/
 interfaces.py mutable/ node.py nodemaker.py  storage_client.py uri.py
 scripts -print0 | xargs -0 sloccount
 }}}
 Result:
 {{{
 SLOC    Directory       SLOC-by-Language (Sorted)
 3906    scripts         python=3906
 3199    immutable       python=3199
 2885    top_dir         python=2885
 2453    mutable         python=2453
 1446    downloader      python=1446


 Totals grouped by language (dominant language first):
 python:       13889 (100.00%)
 }}}
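
 For readers without `sloccount` installed, the two measurements above can
 be roughly approximated in Python. This is only a sketch: the helper names
 `iter_files` and `count_lines` are made up for illustration, and the "code
 lines" figure simply skips blank lines and lines starting with `#`, which
 is an assumption and not sloccount's actual counting rule (docstrings, for
 example, are still counted here), so the numbers will not match exactly.

```python
import os


def iter_files(paths):
    """Yield every regular file named in paths, descending into directories."""
    for path in paths:
        if os.path.isdir(path):
            for dirpath, _dirnames, filenames in os.walk(path):
                for name in sorted(filenames):
                    yield os.path.join(dirpath, name)
        elif os.path.isfile(path):
            yield path


def count_lines(paths):
    """Return (total_lines, code_lines) summed over all files under paths.

    total_lines counts newline characters, as `wc -l` does.
    code_lines is a crude estimate: it skips blank lines and lines whose
    first non-blank character is '#'. This is NOT how sloccount counts.
    """
    total = 0
    code = 0
    for filename in iter_files(paths):
        # Read as bytes so that files in any encoding can be counted.
        with open(filename, "rb") as f:
            data = f.read()
        total += data.count(b"\n")
        for line in data.splitlines():
            stripped = line.strip()
            if stripped and not stripped.startswith(b"#"):
                code += 1
    return total, code
```

 Calling `count_lines` with the same file and directory names that were
 passed to `find` above (run from the same directory) returns the two
 totals as a tuple. Like the `find ... -type f` command, it counts every
 file it finds, not only `.py` files.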

 So it is approximately the same amount of code and comments as tarsnap!
 (Although perhaps you would want to include or exclude different files or
 directories in your reading of the client.) Of course, Tahoe-LAFS is
 written in Python and tarsnap is written in C.

 This ticket is to create a short document which describes for a newcomer
 how to find the source code which implements just the client, and then the
 source code which implements just the server, and what the different parts
 of the client source code are for, etc.

 Alternately, instead of making a separate document for this, take extant
 documents and add hyperlinks into them pointing to the relevant source
 code. Documents which might benefit from pointers to source code:

 * [source:trunk/docs/architecture.rst architecture.rst]
 * [source:trunk/docs/frontends/CLI.rst frontends/CLI.rst]
 * [source:trunk/docs/frontends/FTP-and-SFTP.rst frontends/FTP-and-SFTP.rst]
 * [source:trunk/docs/frontends/webapi.rst frontends/webapi.rst]
 * [source:trunk/docs/specifications/dirnodes.rst specifications/dirnodes.rst]
 * [source:trunk/docs/specifications/mutable.rst specifications/mutable.rst]
 * [source:trunk/docs/specifications/servers-of-happiness.rst specifications/servers-of-happiness.rst]
 * [source:trunk/docs/specifications/URI-extension.rst specifications/URI-extension.rst]
 * [source:trunk/docs/specifications/uri.rst specifications/uri.rst]
 * [source:trunk/docs/backupdb.rst backupdb.rst]
 * [source:trunk/docs/garbage-collection.rst garbage-collection.rst]
 * [source:trunk/docs/helper.rst helper.rst]

 Hm, and at the same time, the source files which are pointed to from these
 doc files should have links in them pointing back to these doc files.

--

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1374#comment:8>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage

