[tahoe-lafs-trac-stream] [tahoe-lafs] #1398: make docs/performance.rst more precise and accurate

Sun May 8 05:29:46 PDT 2011

#1398: make docs/performance.rst more precise and accurate
-------------------------------------------------+-------------------------
 Reporter:  zooko                                |          Owner:  somebody
     Type:  defect                               |         Status:  new
 Priority:  major                                |      Milestone:  undecided
Component:  documentation                        |        Version:  1.8.2
 Keywords:  easy scalability docs performance    |  Launchpad Bug:
            memory large                         |
-------------------------------------------------+-------------------------
 In comment:13:ticket:1395, Brian wrote, concerning the [http://tahoe-
 lafs.org/trac/tahoe-
 lafs/browser/trunk/docs/performance.rst?rev=4910#performing-a-file-verify-
 on-an-a-byte-file "Performing a file-verify on an A-byte file"]:

  to be "N/K*S times a small multiple". I think the multiple is currently
 about 2 or 3. During encryption, we hold both a plaintext share and a
 ciphertext share in RAM at the same time (so 2*S), then we drop the
 plaintext. During erasure-coding, we hold a whole S of ciphertext in
 memory at the same time as the N/K*S shares, then we drop the ciphertext
 before pushing. We also pipeline the sends a little bit, I think 10kB or
 50kB per server, to get better utilization out of a non-zero-latency wire.

  Also Python's memory-management strategy interacts weirdly. Dropping the
 plaintext segment may not be enough: Python might not re-use that memory
 space for anything else right away. Although I'd expect it to de-fragment
 or coalesce free blocks before asking the OS for so much memory that it
 crashed.
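
 Brian's accounting above can be sketched as a per-phase estimate. This is
 a rough model only: the function names, the phase names, and the pipeline
 parameters are hypothetical, the example values (S = 128 KiB, K = 3,
 N = 10) are assumed for illustration, and it ignores Python's allocator
 behaviour, so real usage may well be higher.

```python
from fractions import Fraction


def upload_memory_phases(S, K, N, pipeline_per_server=0, num_servers=0):
    """Estimate bytes held in RAM per phase while uploading one segment.

    Hypothetical helper, not part of Tahoe-LAFS. S is the segment size,
    K and N are the erasure-coding parameters.
    """
    blocks = Fraction(N, K) * S                   # N/K*S bytes of encoded output
    pipeline = pipeline_per_server * num_servers  # assumed send buffering
    return {
        "encrypt": 2 * S + pipeline,      # plaintext + ciphertext segments
        "encode": S + blocks + pipeline,  # ciphertext + encoded output
        "push": blocks + pipeline,        # ciphertext dropped before pushing
    }


def peak_multiple(S, K, N):
    """Peak footprint expressed as a multiple of N/K*S (pipeline ignored)."""
    phases = upload_memory_phases(S, K, N)
    return max(phases.values()) / (Fraction(N, K) * S)


if __name__ == "__main__":
    for name, size in upload_memory_phases(128 * 1024, 3, 10).items():
        print(name, float(size))
    print("peak multiple of N/K*S:", float(peak_multiple(128 * 1024, 3, 10)))
```

 Under this model the peak for 3-of-10 encoding is 1.3 times N/K*S, which
 is below Brian's "about 2 or 3". The model does not explain that gap;
 freed segments not being reused right away is one candidate.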

 I wonder, though, whether Brian was thinking of repair rather than verify,
 since he talks about encrypting, which is not done in verify.

 Subsequently I reviewed the document and I see a bunch of things I'm not
 sure are right. (Note that I myself am mostly responsible for the current
 state of this document.)

 * [http://tahoe-lafs.org/trac/tahoe-
 lafs/browser/trunk/docs/performance.rst?rev=4910#publishing-an-a-byte-
 immutable-file "Publishing an A-byte immutable file" / "when the file is
 already uploaded"]: {{{memory footprint: N/K*S}}}. Shouldn't that just be
 {{{memory footprint: S}}}? All it does is read each {{{S}}}-byte segment
 in turn and hash it. {{{K}}} and {{{N}}} shouldn't come into it. This is
 probably just a cut-and-paste error or think-o on my part originally, so
 unless someone knows of a good reason why I wrote that, I'm going to
 change it to {{{memory footprint: S}}}.

 * [http://tahoe-lafs.org/trac/tahoe-
 lafs/browser/trunk/docs/performance.rst?rev=4910#publishing-an-a-byte-
 immutable-file "Publishing an A-byte immutable file" / "when the file is
 not already uploaded"]: if we're going to make the {{{memory footprint}}}
 more precise as Brian suggests above then this one should be changed too.
 Also {{{network: ~N + ~A}}} should actually be {{{network: N/K*~A}}},
 right?

 * [http://tahoe-lafs.org/trac/tahoe-
 lafs/browser/trunk/docs/performance.rst?rev=4910#downloading-b-bytes-of-
 an-a-byte-immutable-file "Downloading B bytes of an A-byte immutable
 file"]: {{{cpu: ~A}}}. What? The CPU usage for downloading {{{B}}} bytes
 of an {{{A}}}-byte immutable file is {{{~A}}}? I really hope it is
 actually {{{~B}}} (plus an amount of CPU logarithmic in {{{A}}} for Merkle
 Tree verification, but I'm not sure we should try to include that much
 precision in this document). Unless someone tells me I'm wrong now and was
 right then, I'm going to change this to {{{cpu: ~B}}}.

 * [http://tahoe-lafs.org/trac/tahoe-
 lafs/browser/trunk/docs/performance.rst?rev=4910#repairing-an-a-byte-file-
 mutable-or-immutable "Repairing an A-byte file"]: {{{network: variable; up
 to around ~A}}}. Surely that should say {{{network: variable; up to around
 N/K*A}}}?
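
 If the corrections proposed in the list above are right, the estimates
 reduce to a few lines of arithmetic. A minimal sketch, with hypothetical
 function names, assuming fixed-size segments of S bytes and a binary
 Merkle tree over the segments:

```python
def upload_network_bytes(A, K, N):
    """Bytes sent when publishing, or fully repairing, an A-byte file: N/K*A."""
    return N * A / K


def merkle_path_length(A, S):
    """Tree levels checked per segment, assuming a binary tree over
    ceil(A/S) segments. This is the part that grows with log(A)."""
    num_segments = max(1, -(-A // S))  # ceiling division
    return (num_segments - 1).bit_length()


def download_cpu_estimate(A, B, S):
    """Return (bytes processed, upper bound on tree levels checked) for
    reading B bytes of an A-byte file: ~B plus a term logarithmic in A."""
    segments_touched = max(1, -(-B // S))
    return B, segments_touched * merkle_path_length(A, S)


if __name__ == "__main__":
    # Example values are assumptions: 3-of-10 encoding, 128 KiB segments.
    print(upload_network_bytes(3_000_000, 3, 10))
    print(download_cpu_estimate(1_000_000, 200_000, 128 * 1024))
```

 The second value returned by download_cpu_estimate is only an upper
 bound, since adjacent segments share most of their path through the tree.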

-- 
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1398>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage