[tahoe-lafs-trac-stream] [Tahoe-LAFS] #1374: "walk through" or guide for people who want to read some code
Tahoe-LAFS
trac at tahoe-lafs.org
Thu Sep 11 22:21:23 UTC 2014
#1374: "walk through" or guide for people who want to read some code
-------------------------------+-----------------------
     Reporter:  zooko          |      Owner:  nobody
         Type:  enhancement    |     Status:  new
     Priority:  major          |  Milestone:  undecided
    Component:  documentation  |    Version:  1.8.2
   Resolution:                 |   Keywords:  docs
Launchpad Bug:                 |
-------------------------------+-----------------------
Changes (by warner):
* component: unknown => documentation
New description:
Riastradh writes on IRC: "in an afternoon I was able to understand
basically everything Tarsnap does, and had time to spare to take a closer
look at some details"

In doing so, he found a major bug in Tarsnap's encryption which exposed
all plaintext of all users until his bug report led to Tarsnap fixing it.

So: I would like it if people like Riastradh were more likely to read the
source code of Tahoe-LAFS in an afternoon!

I asked how many lines of code are in Tarsnap, and he replied "The core is
twenty thousand lines of heavily commented C (probably half the lines are
comments). A good chunk of that, about a thousand lines, is just a
modified front end to bsdtar."

Note that Tarsnap distributes the source code only for the client (the
source code of the server is secret).

I ran the following commands to get a rough estimate of the number of lines
of code you would want to read to get a good idea of the behavior of the
Tahoe-LAFS client:
{{{
find client.py codec.py control.py dirnode.py hashtree.py immutable/ \
  interfaces.py mutable/ node.py nodemaker.py storage_client.py uri.py \
  scripts -type f -print0 | xargs -0 wc -l
}}}
Result:
{{{
21767 total
}}}
{{{
find client.py codec.py control.py dirnode.py hashtree.py immutable/ \
  interfaces.py mutable/ node.py nodemaker.py storage_client.py uri.py \
  scripts -print0 | xargs -0 sloccount
}}}
Result:
{{{
SLOC    Directory       SLOC-by-Language (Sorted)
3906    scripts         python=3906
3199    immutable       python=3199
2885    top_dir         python=2885
2453    mutable         python=2453
1446    downloader      python=1446

Totals grouped by language (dominant language first):
python:       13889 (100.00%)
}}}
So it is approximately the same amount of code and comments as Tarsnap!
(You might want to include or exclude different files or directories in
your own reading of the client.) Of course, Tahoe-LAFS is written in
Python and Tarsnap is written in C.
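
For anyone who wants to repeat the estimate without find, xargs, or
sloccount, here is a minimal sketch in Python. The function names
(iter_files, count_lines) are made up for illustration, and the
"code line" count is only a crude filter (non-blank lines that do not
start with "#"), so it will not match sloccount's numbers. The total is
also not guaranteed to equal the wc -l figure, since wc counts newline
characters and this counts lines.

```python
import os


def iter_files(paths):
    """Yield every regular file named in paths or found under a directory in paths."""
    for path in paths:
        if os.path.isdir(path):
            for root, _dirs, names in os.walk(path):
                for name in sorted(names):
                    yield os.path.join(root, name)
        elif os.path.isfile(path):
            yield path


def count_lines(paths):
    """Return (total_lines, rough_code_lines) summed over all files under paths.

    rough_code_lines skips blank lines and lines whose first non-space
    character is '#'. Docstrings and trailing comments are still counted.
    """
    total = 0
    code = 0
    for filename in iter_files(paths):
        with open(filename, "rb") as f:
            for line in f:
                total += 1
                stripped = line.strip()
                if stripped and not stripped.startswith(b"#"):
                    code += 1
    return total, code
```

Calling count_lines() with the same list of files and directories that
is passed to find above, from the same working directory, should give a
comparable rough total.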

This ticket is to create a short document which tells a newcomer how to
find the source code which implements just the client, then the source
code which implements just the server, and what the different parts of
the client source code are for.

Alternatively, instead of making a separate document for this, take
existing documents and add hyperlinks to them pointing to the relevant
source code. Documents which might benefit from pointers to source code:

 * [source:trunk/docs/architecture.rst architecture.rst]
 * [source:trunk/docs/frontends/CLI.rst frontends/CLI.rst]
 * [source:trunk/docs/frontends/FTP-and-SFTP.rst frontends/FTP-and-SFTP.rst]
 * [source:trunk/docs/frontends/webapi.rst frontends/webapi.rst]
 * [source:trunk/docs/specifications/dirnodes.rst specifications/dirnodes.rst]
 * [source:trunk/docs/specifications/mutable.rst specifications/mutable.rst]
 * [source:trunk/docs/specifications/servers-of-happiness.rst specifications/servers-of-happiness.rst]
 * [source:trunk/docs/specifications/URI-extension.rst specifications/URI-extension.rst]
 * [source:trunk/docs/specifications/uri.rst specifications/uri.rst]
 * [source:trunk/docs/backupdb.rst backupdb.rst]
 * [source:trunk/docs/garbage-collection.rst garbage-collection.rst]
 * [source:trunk/docs/helper.rst helper.rst]

Hm, and at the same time, the source files which these doc files point to
should contain links pointing back to those doc files.
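
One way the two-way linking could be checked mechanically is sketched
below. Everything here is an assumption made for illustration: the
function name missing_backlinks, the idea that a doc refers to a source
file by a path ending in ".py", and the idea that a source file links
back simply by mentioning the doc's path somewhere in its text. No such
tool or convention exists in the tree as far as this ticket says.

```python
import re

# Assumption: docs mention source files by a path ending in ".py".
SOURCE_REF = re.compile(r"[\w./-]+\.py\b")


def missing_backlinks(docs, sources):
    """Find source files that a doc points to but that do not point back.

    docs and sources each map a path to the text of that file.
    Returns a sorted list of (doc_path, source_path) pairs where the doc
    mentions the source path but the source text never mentions the doc path.
    """
    missing = []
    for doc_path, doc_text in docs.items():
        for ref in set(SOURCE_REF.findall(doc_text)):
            # Only check references that resolve to a known source file.
            if ref in sources and doc_path not in sources[ref]:
                missing.append((doc_path, ref))
    return sorted(missing)
```

The caller is expected to read the doc and source files into the two
dictionaries, using the same path spelling in both directions.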
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1374#comment:8>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage