[tahoe-lafs-trac-stream] [tahoe-lafs] #1398: make docs/performance.rst more precise and accurate
tahoe-lafs
trac at tahoe-lafs.org
Sun May 8 05:29:46 PDT 2011
#1398: make docs/performance.rst more precise and accurate
-------------------------------------------------+-------------------------
 Reporter:  zooko                                |          Owner:  somebody
     Type:  defect                               |         Status:  new
 Priority:  major                                |      Milestone:  undecided
Component:  documentation                        |        Version:  1.8.2
 Keywords:  easy scalability docs performance    |  Launchpad Bug:
            memory large                         |
-------------------------------------------------+-------------------------
In comment:13:ticket:1395, Brian wrote, concerning the
[http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst?rev=4910#performing-a-file-verify-on-an-a-byte-file "Performing a file-verify on an A-byte file"]:
to be "N/K*S times a small multiple". I think the multiple is currently
about 2 or 3. During encryption, we hold both a plaintext share and a
ciphertext share in RAM at the same time (so 2*S), then we drop the
plaintext. During erasure-coding, we hold a whole S of ciphertext in
memory at the same time as the N/K*S shares, then we drop the ciphertext
before pushing. We also pipeline the sends a little bit, I think 10kB or
50kB per server, to get better utilization out of a non-zero-latency wire.
Also Python's memory-management strategy interacts weirdly. Dropping the
plaintext segment may not be enough: Python might not re-use that memory
space for anything else right away. Although I'd expect it to de-fragment
or coalesce free blocks before asking the OS for so much memory that it
crashed.
Although I wonder if Brian was thinking of repair rather than verify since
he talks about encrypting, which is not done in verify.
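The arithmetic in Brian's comment can be sketched roughly as follows. This is a minimal model, not code from the Tahoe-LAFS source: the function name is hypothetical, and the 3-of-10 encoding and 128 KiB segment size in the example are assumed values chosen for illustration. It covers only the two phases he describes (encryption, then erasure-coding) and ignores both the per-server send pipelining and Python's allocator behaviour, either of which could push real usage higher.

```python
def peak_segment_memory(k, n, segment_size):
    """Estimate peak RAM in bytes while one segment is encrypted and encoded.

    Hypothetical helper; it follows the two phases described in the quoted
    comment and is not taken from the Tahoe-LAFS source.
    """
    # Encryption: plaintext and ciphertext segments held together (2*S).
    encrypt_phase = 2 * segment_size
    # Erasure coding: one ciphertext segment plus N/K*S bytes of encoded blocks.
    encode_phase = segment_size + n * segment_size / k
    return max(encrypt_phase, encode_phase)


if __name__ == "__main__":
    # Assumed example values: 3-of-10 encoding, 128 KiB segments.
    k, n, s = 3, 10, 128 * 1024
    print("N/K*S        = %d bytes" % (n * s / k))
    print("peak (model) = %d bytes" % peak_segment_memory(k, n, s))
```

Under this model the erasure-coding phase dominates whenever N/K is greater than 1, so any larger multiple observed in practice would have to come from the factors the model leaves out.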
Subsequently I reviewed the document and I see a bunch of things I'm not
sure are right. (Note that I myself am mostly responsible for the current
state of this document.)
* [http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst?rev=4910#publishing-an-a-byte-immutable-file "Publishing an A-byte immutable file" / "when the file is
already uploaded"]: {{{memory footprint: N/K*S}}}. Shouldn't that just be
{{{memory footprint: S}}}? All it does is read each {{{S}}}-byte segment
in turn and hash it. {{{K}}} and {{{N}}} shouldn't come into it. This is
probably just a cut-and-paste error or a think-o on my part
originally, so unless someone else knows of a better reason why I wrote
that, I'm going to change it to {{{memory footprint: S}}}.
* [http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst?rev=4910#publishing-an-a-byte-immutable-file "Publishing an A-byte immutable file" / "when the file is
not already uploaded"]: if we're going to make the {{{memory footprint}}}
more precise as Brian suggests above then this one should be changed too.
Also {{{network: ~N + ~A}}} should actually be {{{network: N/K*~A}}},
right?
* [http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst?rev=4910#downloading-b-bytes-of-an-a-byte-immutable-file "Downloading B bytes of an A-byte immutable
file"]: {{{cpu: ~A}}}. What? The CPU usage for downloading {{{B}}} bytes
of an {{{A}}}-byte immutable file is {{{~A}}}? I really hope it is
actually {{{~B}}} (plus an amount of CPU logarithmic in {{{A}}} for Merkle
Tree verification, but I'm not sure we should try to include that much
precision in this document). Unless someone tells me I'm wrong now and was
right then, I'm going to change this to {{{cpu: ~B}}}.
* [http://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/performance.rst?rev=4910#repairing-an-a-byte-file-mutable-or-immutable "Repairing an A-byte file"]: {{{network: variable; up
to around ~A}}}. Surely that should say {{{network: variable; up to around
N/K*A}}}?
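To make the proposed changes easier to compare, here is a small sketch that evaluates them for given {{{A}}}, {{{B}}}, {{{S}}}, {{{K}}} and {{{N}}}. The function and key names are hypothetical, and the formulas are only the ones suggested in the list above, so they are order-of-magnitude estimates (the {{{~}}} in performance.rst) rather than exact byte counts. The Merkle-tree term that is logarithmic in {{{A}}} is deliberately left out.

```python
def proposed_estimates(a, b, s, k, n):
    """Return the estimates proposed in this ticket.

    a: file size, b: bytes downloaded, s: segment size,
    k, n: erasure-coding parameters (any k of n shares suffice).
    Hypothetical helper for illustration only.
    """
    expansion = n / k  # erasure-coding expansion factor N/K
    return {
        # Already-uploaded file: each S-byte segment is only read and hashed.
        "publish_already_uploaded_memory": s,
        # New upload: all N shares are sent, N/K times the file size in total.
        "publish_new_network": expansion * a,
        # Download: CPU should scale with the bytes requested, not the file.
        "download_cpu": b,
        # Repair: worst case is regenerating and pushing every share.
        "repair_network_max": expansion * a,
    }


if __name__ == "__main__":
    # Assumed example: 1 MB file, 64 kB read, 128 KiB segments, 3-of-10.
    for name, value in proposed_estimates(10**6, 64000, 128 * 1024, 3, 10).items():
        print("%s: ~%d" % (name, value))
```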
--
Ticket URL: <http://tahoe-lafs.org/trac/tahoe-lafs/ticket/1398>
tahoe-lafs <http://tahoe-lafs.org>
secure decentralized storage