[tahoe-dev] File size problem
Jan-Benedict Glaw
jbglaw at lug-owl.de
Fri Nov 28 16:32:46 PST 2008
On Fri, 2008-11-28 14:09:02 -0800, Brian Warner <warner-tahoe at allmydata.com> wrote:
> > As the next step, I tried to upload a file (32472771 bytes) through
> > the web frontend, which resulted in two issues:
>
> Yeah, as Zooko pointed out, the Tahoe default is to upload everything
> as an immutable file, which can be as large as 12GB (and we're a few
> code changes away from raising that limit into the exabyte range).
>
> Mutable files exist mainly to support directories, so we haven't yet
> finished the coding necessary to support large mutable files. The
> current limit of 3.5MB is somewhat arbitrary, but it accomplishes a
> couple of useful goals (reasonable alacrity, easy enough to implement
> quickly). Some day we'll have larger mutable files, but since 3.5MB is
> enough for a directory with tens of thousands of entries, it hasn't
> been a high priority so far.
How do mutable and immutable files (for /regular/ files) actually
differ? When I tested with an immutable file, I'd create one, delete
it and create a new file with different content (but the same name).
(I noticed that `du -ms' on the node's directories didn't get smaller.
When files are deleted, are their fragments actually removed, or just
marked as deletable in some way?)
> > * When I `ps axflwww'ed the process's memory usage, I saw that the
> > python instance that ran the connection where I uploaded the 31MB
> > file grew beyond 300MB of VSZ.
>
> That sounds like a bug in the code that's rejecting the too-large
> mutable file. Which version were you running? (1.2.0 or current
> trunk?). If it was current trunk, I'll look more closely at the
I `apt-get install'ed the "allmydata-tahoe" package; right now, that's
version 1.2.0-r3241. Since I tested some three days ago, it might
have been a few revisions earlier than 3241.
> problem. We've had runaway processes happen before when some bit of
> error-handling code got confused inside a loop. In fact, I think I
> remember a ticket about this, so I suspect it's been fixed in trunk.
There's also a "tahoe-prod" package at version 3.0.0-r2758. Should I
re-do the testing with that one?
> > I better keep my brain away from
> > thinking about uploading a gigabyte sized file on a 32bit
> > system...
>
> Oh, GB-sized *immutable* files work just fine. In fact we have over 800
> files 1GB or larger on our production network right now, and 87 files
> in the 3GB-10GB range. It took their owners a long time to upload them,
> but as far as we can tell, the uploads succeeded.
I hoped to hear something like that. I couldn't really imagine that
any customer pushing backup data into a tahoe grid would place tar
scripts cutting the tarballs into 3MB pieces :->
> > And during create-client and create-introducer, an empty directory is
> > required. It would be nice to ignore "lost+found" while looking up
> > directory contents...
[...]
> So I'm inclined to continue to encourage users to have a dedicated
> directory for their Tahoe node (by having 'tahoe create-client' create
> a brand-new directory for them). It sounds like you have a dedicated
> partition for your Tahoe node.. that's great. Just use 'tahoe
> create-client /newpartition/tahoe', and let the node have its own
> dedicated directory as well.
I ended up doing something like that.
What is the key-generator actually used for? It doesn't come up right
after creating one:
----------------------------------------------------------------------------
spinne:/mnt/tahoe# tahoe create-key-generator -C foo
spinne:/mnt/tahoe# tahoe start -C foo
STARTING /mnt/tahoe/foo
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 614, in run
    runApp(config)
  File "/usr/lib/python2.5/site-packages/twisted/scripts/twistd.py", line 23, in runApp
    _SomeApplicationRunner(config).run()
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 330, in run
    self.application = self.createOrGetApplication()
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 416, in createOrGetApplication
    application = getApplication(self.config, passphrase)
--- <exception caught here> ---
  File "/usr/lib/python2.5/site-packages/twisted/application/app.py", line 427, in getApplication
    application = service.loadApplication(filename, style, passphrase)
  File "/usr/lib/python2.5/site-packages/twisted/application/service.py", line 368, in loadApplication
    application = sob.loadValueFromFile(filename, 'application', passphrase)
  File "/usr/lib/python2.5/site-packages/twisted/persisted/sob.py", line 214, in loadValueFromFile
    exec fileObj in d, d
  File "tahoe-key-generator.tac", line 7, in <module>
    k = key_generator.KeyGeneratorService(2048)
  File "/usr/lib/python2.5/site-packages/allmydata/key_generator.py", line 82, in __init__
    self.tub = foolscap.Tub(certFile=os.path.join(self.basedir, 'key_generator.pem'))
  File "/usr/lib/python2.5/posixpath.py", line 62, in join
    elif path == '' or path.endswith('/'):
exceptions.AttributeError: 'int' object has no attribute 'endswith'
Failed to load application: 'int' object has no attribute 'endswith'
unknown (tahoe-key-generator.tac) node probably not started
----------------------------------------------------------------------------
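Reading the traceback, my guess is that the 2048 in the generated .tac
file is meant as a key size but lands in the constructor's first
positional parameter, so self.basedir ends up being an integer and
os.path.join() chokes on it. Here is a minimal sketch of that failure
mode. Note that KeyGeneratorSketch, its parameter names and
try_construct are made up for illustration; I have not checked the
real signature in allmydata/key_generator.py:

```python
import os


class KeyGeneratorSketch:
    """Hypothetical stand-in, NOT the real allmydata class."""

    # Assumed parameter order: base directory first, key size second.
    def __init__(self, basedir=".", default_key_size=2048):
        self.basedir = basedir
        self.default_key_size = default_key_size
        # Mirrors the os.path.join() call shown in the traceback.
        self.pem_path = os.path.join(self.basedir, "key_generator.pem")


def try_construct(*args, **kwargs):
    """Return the .pem path on success, or the exception's class name."""
    try:
        return KeyGeneratorSketch(*args, **kwargs).pem_path
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__


# Positional 2048 is taken as basedir, so the path join fails.
print(try_construct(2048))
# Passing the key size by keyword leaves basedir at its default.
print(try_construct(default_key_size=2048))
```

Python 2.5 reports this as the AttributeError above; newer Python
versions raise a TypeError from os.path.join() instead. If the guess
is right, passing the key size by keyword in the .tac file should be
enough to get the node started.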
The create-stats-gatherer command works (though I still have to find out
what it's actually used for, just like the key generator.
Maybe I'll draw a nice little sheet with the basics about upload
helper, access/storage nodes, introducer, etc. one of these days.)
MfG, JBG
--
Jan-Benedict Glaw jbglaw at lug-owl.de +49-172-7608481
Signature of: Progress means taking a step in such a way
the second : that you can still take the next one.