[tahoe-dev] [tahoe-lafs] #694: remove hard limit on mutable file size (was: tahoe put returns "error, got 413 request entity too large")
tahoe-lafs
trac at allmydata.org
Mon May 4 10:15:05 PDT 2009
#694: remove hard limit on mutable file size
------------------------+---------------------------------------------------
Reporter: sigmonsays | Owner: nobody
Type: defect | Status: new
Priority: major | Milestone: undecided
Component: unknown | Version: 1.4.1
Keywords: | Launchpad_bug:
------------------------+---------------------------------------------------
Comment(by zooko):
Thank you for the bug report, sigmonsays. There is a hardcoded limit on
the maximum size of mutable files. Directories are stored inside mutable
files. The directory into which you are linking your new file would grow
beyond the limit by the addition of this link.
I believe the first thing to do is to remove the hardcoded limit, and I'm
accordingly changing the title of this ticket to "remove hard limit on
mutable file size". The line of code in question is
[source:src/allmydata/mutable/publish.py@20090222233056-4233b-171d02bfd1df45fff4af7a4f64863755379e855a#L145
publish.py line 145].
Someone go fix it! Just remove the {{{MAX_SEGMENT_SIZE}}} hardcoded
parameter and both places where it is used.
There is already a unit test in
[source:src/allmydata/test/test_mutable.py@20090218222301-4233b-49132283585996c7cee159d1ce2a9133bdd00aa7#L359
test_mutable.py line 359]
that makes sure that Tahoe raises a failure when you try to create a
mutable file that is bigger than 3,500,000 bytes. Change that test to
make sure that Tahoe ''doesn't'' raise a failure and instead that the file
is created.
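As a rough illustration of that test change, here is a minimal,
self-contained sketch. It does not use Tahoe's real API:
{{{publish_with_limit}}}, {{{publish_without_limit}}},
{{{FileTooLargeError}}} and {{{OLD_LIMIT}}} are hypothetical stand-ins, and
the only number taken from this ticket is the 3,500,000-byte figure.

```python
import unittest

# Hypothetical stand-ins for illustration only; not Tahoe's real API.
OLD_LIMIT = 3500000  # bytes; the size named in the ticket comment


class FileTooLargeError(Exception):
    """Raised by the old, size-limited publish path."""


def publish_with_limit(contents):
    """Old behaviour: refuse anything larger than OLD_LIMIT."""
    if len(contents) > OLD_LIMIT:
        raise FileTooLargeError(len(contents))
    return len(contents)


def publish_without_limit(contents):
    """New behaviour: accept contents of any size."""
    return len(contents)


class LargeMutableFile(unittest.TestCase):
    BIG = b"x" * (OLD_LIMIT + 1)

    def test_old_behaviour_rejects(self):
        # Roughly what the existing test asserts today.
        with self.assertRaises(FileTooLargeError):
            publish_with_limit(self.BIG)

    def test_new_behaviour_accepts(self):
        # What the test should assert once the limit is gone.
        self.assertEqual(publish_without_limit(self.BIG), OLD_LIMIT + 1)
```

The real test would exercise the actual publish path; the point here is
only the flip from "expect a failure" to "expect the file to be created".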
After that, however, you might start to learn why we put that limit in --
it is because modifying a mutable file requires downloading and re-
uploading the entirety of that mutable file, and storing the entirety of
it in RAM while changing it. So the more links you keep in that directory
of yours, the slower it is going to be to read the directory or to change
it, and the more RAM will be used.
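To make that cost concrete, here is a small toy model. It assumes a
directory kept as one serialized blob that must be read and rewritten in
full for every change. The JSON encoding and the names {{{add_link}}} and
{{{total_transfer}}} are made up for this sketch and are not how Tahoe
actually serializes directories.

```python
import json


def add_link(stored_blob, name, target):
    """Add one entry to a directory kept as a single serialized blob.

    Returns (new_blob, bytes_downloaded, bytes_uploaded).
    """
    # The whole blob is fetched and decoded in memory...
    entries = json.loads(stored_blob.decode("utf-8"))
    entries[name] = target
    # ...and the whole blob is re-encoded and written back.
    new_blob = json.dumps(entries, sort_keys=True).encode("utf-8")
    return new_blob, len(stored_blob), len(new_blob)


def total_transfer(num_links):
    """Bytes moved while adding num_links entries one at a time."""
    blob = b"{}"
    moved = 0
    for i in range(num_links):
        # The target strings are placeholders, not real capability URIs.
        blob, down, up = add_link(
            blob, "file%06d" % i, "URI:hypothetical:%06d" % i
        )
        moved += down + up
    return moved
```

In this model each added link costs a transfer proportional to the current
directory size, so filling a directory with N links moves on the order of
N squared bytes in total.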
Ultimately we need to implement efficient modification of mutable files
without downloading and re-uploading the whole file -- that is the subject
of #393 (mutable: implement MDMF).
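The general idea of segment-granular updates can be sketched as follows.
This is not the design from #393, only an illustration that assumes the
file is split into fixed-size segments; {{{segments_touched}}} and its
parameters are hypothetical.

```python
def segments_touched(offset, length, segment_size):
    """Return the indices of the fixed-size segments a write overlaps."""
    if length <= 0:
        return []
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    return list(range(first, last + 1))
```

Under that assumption, a small write would only need the segments in the
returned list to be fetched and rewritten, instead of the whole file.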
In the meantime, there are also some tickets about optimizing CPU usage
when processing large directories. Fixing these would not change the fact
that the entire directory has to be downloaded and re-uploaded, but they
might also be important: #327 (performance measurement of directories),
#329 (dirnodes could cache encrypted/serialized entries for speed), #383
(large directories take a long time to modify), #414 (profiling on
directory unpacking).
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/694#comment:2>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid