[tahoe-dev] problem with tahoe 1.9.1: deep-check crashes on unhealthy empty directories
Johannes Nix
Johannes.Nix at gmx.net
Tue Feb 14 21:40:28 UTC 2012
Hi,
First of all, many thanks for this great project. I think such
a foundation for a more decentralized web is very much needed.
I was playing around with TestGrid and ran into a problem
recovering unhealthy files. When I ran "tahoe deep-check --repair",
I got an exception at some point. I used the wolf fence
approach to narrow down the object in question and ended up with an
empty directory.
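A minimal sketch of the wolf fence search, for anyone who wants to repeat it. The two callbacks are hypothetical stand-ins (they are not part of Tahoe): deep_check_ok(path) would run "tahoe deep-check" on a path and report whether it finished without an error, and list_children(path) would return the child paths, for example from "tahoe ls". Listing an unrecoverable directory may itself fail, in which case list_children should return an empty list.

```python
from typing import Callable, Iterable, List


def find_failing_objects(
    path: str,
    list_children: Callable[[str], Iterable[str]],
    deep_check_ok: Callable[[str], bool],
) -> List[str]:
    """Return the deepest paths under `path` whose deep-check fails.

    Both callbacks are hypothetical wrappers supplied by the caller;
    nothing here talks to a Tahoe node directly.
    """
    if deep_check_ok(path):
        # The whole subtree checks cleanly, so the culprit is elsewhere.
        return []
    failing: List[str] = []
    for child in list_children(path):
        failing.extend(find_failing_objects(child, list_children, deep_check_ok))
    # If no child subtree fails on its own, this object is the culprit.
    return failing or [path]
```

With a tree like the one in the terminal output below (test: containing dir1 and dir2, where only dir1 fails), the search would narrow the failure down to test:dir1.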
So my first assumption is that deep-check crashes on unhealthy
empty directories.
To test this assumption, I tried to generate unhealthy empty directories
and deep-checked them (see long terminal output below).
As you can see, in this attempt deep-check failed even without the
--repair option (which differs from the behaviour that I
saw before). I also tried to place an empty file in the directory,
and that failed, too; apparently, deep-check does not
take into account that the dir was created with different redundancy
parameters and gets confused. However, "tahoe check" succeeded.
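To illustrate why the encoding matters, here is a deliberately simplified model of share counting. It is not Tahoe's checker logic; the Encoding class and classify() are names made up for this sketch, and it ignores shares.happy and server placement. It only shows that the same three good shares look different depending on whether they are judged against the 3-of-3 encoding the directory was created with or the 3-of-10 encoding currently in tahoe.cfg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    """Hypothetical container for the k-of-N parameters."""
    needed: int  # shares.needed (k)
    total: int   # shares.total (N)


def classify(good_shares: int, enc: Encoding) -> str:
    """Simplified model: judge a share count against one encoding."""
    if good_shares < enc.needed:
        return "unrecoverable"  # fewer than k shares: cannot be read
    if good_shares < enc.total:
        return "unhealthy"      # readable, but some shares are missing
    return "healthy"            # all N shares are present


created_with = Encoding(needed=3, total=3)     # parameters when dir1 was made
current_config = Encoding(needed=3, total=10)  # parameters after restoring defaults

print(classify(3, created_with))    # judged against the object's own encoding
print(classify(3, current_config))  # judged against the current configuration
```

In this model neither view makes the directory unrecoverable, so a parameter mismatch alone would not obviously explain the UnrecoverableFileError.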
I don't really understand the code, but my naive guess
is that something goes wrong in tahoe_check.py in
DeepCheckStreamer.run() around line 276.
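For context, my understanding is that the deep-check results arrive as a stream of lines, one JSON object per checked item, and that a line starting with "ERROR:" followed by a traceback replaces the JSON when the traversal fails. The sketch below shows how a client might consume such a stream. It is an assumption-laden illustration, not the code in tahoe_check.py: summarize_stream() is a made-up name, and the JSON field names ("type", "check-results", "results", "healthy") are assumed.

```python
import json
from typing import Iterable, List, Tuple


def summarize_stream(lines: Iterable[str]) -> Tuple[int, int, List[str]]:
    """Count healthy/unhealthy items and collect any error text.

    Returns (healthy, unhealthy, error_lines). The JSON field names
    used here are assumptions made for this sketch.
    """
    healthy = unhealthy = 0
    error_lines: List[str] = []
    in_error = False
    for raw in lines:
        line = raw.rstrip("\n")
        if in_error:
            # Everything after the ERROR: line is traceback text.
            error_lines.append(line)
            continue
        if line.startswith("ERROR:"):
            in_error = True
            error_lines.append(line)
            continue
        if not line:
            continue
        unit = json.loads(line)
        if unit.get("type") in ("file", "directory"):
            results = unit.get("check-results", {}).get("results", {})
            if results.get("healthy"):
                healthy += 1
            else:
                unhealthy += 1
    return healthy, unhealthy, error_lines
```

A client written this way would still report the items checked before the error, which is the behaviour I would have expected from deep-check instead of a bare traceback.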
Hope this is helpful,
Cheers,
Johannes
jnix at colibri:~/.tahoe$ tahoe create-alias test:
Alias 'test' created
jnix at colibri:~/.tahoe$ vi tahoe.cfg # changed shares.* to value below
jnix at colibri:~/.tahoe$ grep ^shares tahoe.cfg
shares.needed = 3
shares.happy = 3
shares.total =3
jnix at colibri:~/.tahoe$ tahoe restart
STOPPING '/home/jnix/.tahoe'
process 11914 is dead
STARTING '/home/jnix/.tahoe'
jnix at colibri:~/.tahoe$ tahoe mkdir test:dir1
URI:DIR2:rvvpmqr4moeuuqblcr7hzrykfq:q3tscbg66cpg7sfv.................cu4cinudtfg2b22ln2a
jnix at colibri:~/.tahoe$ tahoe mkdir test:dir2
URI:DIR2:736yyxu7qfzk3ii7ptgca6zlie:3n6o6pxuh6ri26sb.................nmgxiz4vcrrp544gx2q
jnix at colibri:~/.tahoe$ touch 0
jnix at colibri:~/.tahoe$ tahoe put 0 test:dir2/0
201 Created
URI:LIT:
jnix at colibri:~/.tahoe$ vi tahoe.cfg # restoring parameters to default
jnix at colibri:~/.tahoe$ grep ^shares tahoe.cfg
shares.needed = 3
shares.happy = 4
shares.total = 10
jnix at colibri:~/.tahoe$ tahoe restart
STOPPING '/home/jnix/.tahoe'
process 29876 is dead
STARTING '/home/jnix/.tahoe'
jnix at colibri:~/.tahoe$ tahoe deep-check test:
ERROR: UnrecoverableFileError(no recoverable versions)
"[Failure instance: Traceback: <class 'allmydata.mutable.common.UnrecoverableFileError'>: no recoverable versions"
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/base.py:796:runUntilCurrent
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/foolscap-0.6.3-py2.6.egg/foolscap/eventual.py:26:_turn
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:318:callback
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:424:_startRunCallbacks
--- <exception caught here> ---
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:441:_runCallbacks
/usr/local/lib/allmydata-tahoe-1.9.1/src/allmydata/mutable/filenode.py:401:_get_version
jnix at colibri:~/.tahoe$ tahoe deep-check test:
ERROR: UnrecoverableFileError(no recoverable versions)
"[Failure instance: Traceback: <class 'allmydata.mutable.common.UnrecoverableFileError'>: no recoverable versions"
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/base.py:796:runUntilCurrent
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/foolscap-0.6.3-py2.6.egg/foolscap/eventual.py:26:_turn
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:318:callback
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:424:_startRunCallbacks
--- <exception caught here> ---
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:441:_runCallbacks
/usr/local/lib/allmydata-tahoe-1.9.1/src/allmydata/mutable/filenode.py:401:_get_version
jnix at colibri:~/.tahoe$ tahoe put 0 test:dir1/0
Error: 410 Gone
UnrecoverableFileError: the directory (or mutable file) could not be
retrieved, because there were insufficient good shares. This might
indicate that no servers were connected, insufficient servers were
connected, the URI was corrupt, or that shares have been lost due to
server departure, hard drive failure, or disk corruption. You should
perform a filecheck on this object to learn more.
jnix at colibri:~/.tahoe$ tahoe put 0 test:dir1/0
Error: 410 Gone
UnrecoverableFileError: the directory (or mutable file) could not be
retrieved, because there were insufficient good shares. This might
indicate that no servers were connected, insufficient servers were
connected, the URI was corrupt, or that shares have been lost due to
server departure, hard drive failure, or disk corruption. You should
perform a filecheck on this object to learn more.
jnix at colibri:~/.tahoe$ tahoe put 0 test:dir1/0
Error: 410 Gone
UnrecoverableFileError: the directory (or mutable file) could not be
retrieved, because there were insufficient good shares. This might
indicate that no servers were connected, insufficient servers were
connected, the URI was corrupt, or that shares have been lost due to
server departure, hard drive failure, or disk corruption. You should
perform a filecheck on this object to learn more.
jnix at colibri:~/.tahoe$ tahoe ls test:
dir1
dir2
jnix at colibri:~/.tahoe$ tahoe put 0 test:dir1/0
Error: 410 Gone
UnrecoverableFileError: the directory (or mutable file) could not be
retrieved, because there were insufficient good shares. This might
indicate that no servers were connected, insufficient servers were
connected, the URI was corrupt, or that shares have been lost due to
server departure, hard drive failure, or disk corruption. You should
perform a filecheck on this object to learn more.
jnix at colibri:~/.tahoe$ tahoe deep-check test:dir2
done: 2 objects checked, 2 healthy, 0 unhealthy
jnix at colibri:~/.tahoe$ tahoe deep-check test:dir1
ERROR: UnrecoverableFileError(no recoverable versions)
"[Failure instance: Traceback: <class 'allmydata.mutable.common.UnrecoverableFileError'>: no recoverable versions"
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/base.py:796:runUntilCurrent
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/foolscap-0.6.3-py2.6.egg/foolscap/eventual.py:26:_turn
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:318:callback
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:424:_startRunCallbacks
--- <exception caught here> ---
/usr/local/lib/allmydata-tahoe-1.9.1/support/lib/python2.6/site-packages/Twisted-10.1.0-py2.6-linux-x86_64.egg/twisted/internet/defer.py:441:_runCallbacks
/usr/local/lib/allmydata-tahoe-1.9.1/src/allmydata/mutable/filenode.py:401:_get_version
jnix at colibri:~/.tahoe$ tahoe check test:dir1
Summary: Healthy
storage index: oscuotd7ugdugkxs74ew7xmily
good-shares: 3 (encoding is 3-of-3)
wrong-shares: 0
jnix at colibri:~/.tahoe$ tahoe deep-check test:dir1
done: 1 objects checked, 1 healthy, 0 unhealthy
jnix at colibri:~/.tahoe$ tahoe ls test:dir1
jnix at colibri:~/.tahoe$