Opened at 2009-05-04T16:27:37Z
Closed at 2009-06-21T05:28:47Z
#694 closed defect (fixed)
remove hard limit on mutable file size
Reported by: | sigmonsays | Owned by: | kevan |
---|---|---|---|
Priority: | major | Milestone: | undecided |
Component: | unknown | Version: | 1.4.1 |
Keywords: | easy | Cc: | kevan |
Launchpad Bug: |
Description (last modified by zooko)
<sigmonsays> the tahoe put returns this after several seconds "error, got 413 Request Entity Too Large" [09:06]
<sigmonsays> allmydata.interfaces.FileTooLargeError: SDMF is limited to one segment, and 3500419 > 3500000 [09:07]
<sigmonsays> the file i'm trying to send is only 9k

---- [configuration] ----
3 node cluster, each w/ 2G (about 75% full on each)

---- [the full exception] ----
12148151_1141322388.jpg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 16  7234   16  1232    0     0   5401      0  0:00:01 --:--:--  0:00:01  5401
waiting for file data on stdin..
100  7234  100  7234    0     0  22420      0 --:--:-- --:--:-- --:--:-- 63851
error, got 413 Request Entity Too Large
Traceback (most recent call last):
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/Twisted-8.1.0-py2.4-linux-x86_64.egg/twisted/internet/defer.py", line 312, in _startRunCallbacks
    self._runCallbacks()
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/Twisted-8.1.0-py2.4-linux-x86_64.egg/twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/Twisted-8.1.0-py2.4-linux-x86_64.egg/twisted/internet/defer.py", line 289, in _continue
    self.unpause()
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/Twisted-8.1.0-py2.4-linux-x86_64.egg/twisted/internet/defer.py", line 285, in unpause
    self._runCallbacks()
--- <exception caught here> ---
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/Twisted-8.1.0-py2.4-linux-x86_64.egg/twisted/internet/defer.py", line 328, in _runCallbacks
    self.result = callback(self.result, *args, **kw)
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/src/allmydata/mutable/filenode.py", line 403, in _apply
    return self._upload(new_contents, servermap)
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/src/allmydata/mutable/filenode.py", line 438, in _upload
    return p.publish(new_contents)
  File "/usr/local/src/tahoe-files/allmydata-tahoe-1.4.1/src/allmydata/mutable/publish.py", line 146, in publish
    raise FileTooLargeError("SDMF is limited to one segment, and "
allmydata.interfaces.FileTooLargeError: SDMF is limited to one segment, and 3500419 > 3500000

12062355_1141263102.jpg
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 20  5998   20  1232    0     0   5319      0  0:00:01 --:--:--  0:00:01  5319
waiting for file data on stdin..
100  5998  100  5998    0     0  19168      0 --:--:-- --:--:-- --:--:-- 58839
./send-to-tahoe.sh: line 19: 20708 Broken pipe             cat files.lst
Attachments (2)
Change History (9)
comment:1 Changed at 2009-05-04T16:57:16Z by zooko
- Description modified (diff)
comment:2 Changed at 2009-05-04T17:15:05Z by zooko
- Summary changed from tahoe put reeturns "error, got 413 request entity too large" to remove hard limit on mutable file size
Thank you for the bug report, sigmonsays. There is a hardcoded limit on the maximum size of mutable files, and directories are stored inside mutable files. Adding this link would grow the directory you are linking into beyond that limit.
I believe the first thing to do is to remove the hardcoded limit, and I'm accordingly changing the title of this ticket to "remove hard limit on mutable file size". The line of code in question is publish.py line 145. Someone go fix it! Just remove the MAX_SEGMENT_SIZE hardcoded parameter and all two places that it is used.
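The check in question can be sketched like this (a simplified stand-in, not Tahoe's actual publish.py code; the constant name and limit value are taken from the error message above). The proposed fix is to delete this check and the hardcoded constant it relies on:

```python
# Sketch of the kind of size check publish.py performs before the fix.
# MAX_SEGMENT_SIZE and the 3,500,000-byte limit come from the traceback;
# the surrounding code here is a simplified illustration.
MAX_SEGMENT_SIZE = 3500000


class FileTooLargeError(Exception):
    pass


def check_publish_size(datalength):
    # Before the fix: refuse to publish anything larger than one SDMF
    # segment. Removing this check (and the constant) is the proposed fix.
    if datalength > MAX_SEGMENT_SIZE:
        raise FileTooLargeError(
            "SDMF is limited to one segment, and %d > %d"
            % (datalength, MAX_SEGMENT_SIZE))
```

With the reporter's directory size of 3,500,419 bytes, this check is exactly what fires.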
There is already a unit test in test_mutable.py line 359 that makes sure that Tahoe raises a failure when you try to create a mutable file that is bigger than 3,500,000 bytes. Change that test to make sure that Tahoe doesn't raise a failure and instead that the file is created.
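The inverted test might look roughly like this (a hedged sketch: `create_mutable_file` here is a hypothetical stand-in, since the real test in test_mutable.py drives a Tahoe client node):

```python
import unittest


def create_mutable_file(contents):
    # Stand-in for the real file-creation path. After the fix there is
    # no size check, so arbitrarily large contents simply succeed.
    return {"size": len(contents)}


class TestNoSizeLimit(unittest.TestCase):
    def test_large_mutable_file(self):
        # Formerly this test asserted FileTooLargeError for contents
        # larger than 3,500,000 bytes; now it asserts the file is created.
        node = create_mutable_file(b"x" * 3500419)
        self.assertEqual(node["size"], 3500419)
```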
After that, however, you might start to learn why we put that limit in -- it is because modifying a mutable file requires downloading and re-uploading the entirety of that mutable file, and storing the entirety of it in RAM while changing it. So the more links you keep in that directory of yours, the slower it is going to be to read the directory or to change it, and the more RAM will be used.
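A toy model (an illustration, not Tahoe code) makes the cost concrete: every modification is a full download-modify-reupload cycle, so the bytes moved scale with the whole file, not with the size of the change.

```python
# Toy model of SDMF read-modify-write. Every edit transfers the entire
# contents twice (once down, once up) and holds it all in RAM.
class ToyMutableFile:
    def __init__(self, contents=b""):
        self.contents = contents
        self.bytes_transferred = 0

    def modify(self, mutate):
        data = self.contents                  # "download" the whole file
        self.bytes_transferred += len(data)
        new = mutate(data)                    # entire contents held in RAM
        self.contents = new                   # "re-upload" the whole file
        self.bytes_transferred += len(new)
```

Appending a single 100-byte entry to a 1 MB directory in this model still moves roughly 2 MB over the wire, which is why large directories get slow.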
Ultimately we need to implement efficient modification of mutable files without downloading and re-uploading the whole file -- that is the subject of #393 (mutable: implement MDMF).
In the meantime, there are also some tickets about optimizing CPU usage when processing large directories. Fixing these would not fix the problem that the entire directory has to be downloaded and re-uploaded, but they might also be important: #327 (performance measurement of directories), #329 (dirnodes could cache encrypted/serialized entries for speed), #383 (large directories take a long time to modify), #414 (profiling on directory unpacking).
comment:3 Changed at 2009-06-10T16:39:00Z by zooko
- Keywords easy added
Changed at 2009-06-20T02:53:41Z by kevan
comment:4 Changed at 2009-06-20T02:53:53Z by kevan
I tried my hand at fixing this, and am attaching a patch. Comments?
comment:5 Changed at 2009-06-20T03:52:43Z by zooko
- Owner changed from nobody to kevan
Looks good! Please change the doc:
# this used to be in Publish, but we removed it there. Some of the
# tests in here still use it, though, so here it is.
to something like:
# this used to be in Publish, but we removed the limit. Some of
# these tests test whether the new code correctly allows files
# larger than the limit
You can use darcs unrecord to undo the effect of your previous darcs record and then use darcs record again to make a new patch and attach it to this ticket. By the way, please name it with a trailing '.txt' this time.
Changed at 2009-06-20T21:55:37Z by kevan
comment:7 Changed at 2009-06-21T05:28:47Z by zooko
- Resolution set to fixed
- Status changed from new to closed
Fixed by db939750a8831c1e and efcc45951d3544ee. Thanks, Kevan!