#1916 new defect

Folder healthy, but still get 410 Gone

Reported by: PRabahy Owned by: davidsarah
Priority: normal Milestone: undecided
Component: code-mutable Version: 1.9.2
Keywords: mutable publish heisenbug Cc:
Launchpad Bug:

Description (last modified by zooko)

I tried to copy an item to the public folder on the public grid but received an error. I am able to "ls" the directory, and "check" reports it as healthy, but any attempt to upload fails with an error.

Trace from command line:

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe ls public:
DNSBench.exe
PreviousGridPublicDirectory
ThisDirectoryWritecap-RecursiveLOL
bitcoin-0.7.2-win32-setup.exe
diskcryptor.7z
multibit-0.4.19-windows.exe
python-2.7.3.msi
test_for_martin
thanks!.txt

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe check --repair public:
Summary: healthy
 storage index: txm5k7xe52cw3d4kny372i46ly
 good-shares: 10 (encoding is 1-of-10)
 wrong-shares: 0

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe cp "C:\Users\paul.rabahy\Downloads\Provable Data Possession at !UntrustedStores.pdf" public:
Error examining target directory: 410 Gone
UnrecoverableFileError: the directory (or mutable file) could not be retrieved, because there were insufficient good shares. This might indicate that no servers were connected, insufficient servers were connected, the URI was corrupt, or that shares have been lost due to server departure, hard drive failure, or disk corruption. You should perform a filecheck on this object to learn more.

Attachments (2)

incident-2013-02-12--19-13-04Z-ejgqrja.flog (588.0 KB) - added by PRabahy at 2013-02-12T19:17:56Z.
incident-2013-02-12--20-08-28Z-y7lgyki.flog (621.9 KB) - added by PRabahy at 2013-02-12T20:09:15Z.


Change History (11)

comment:1 Changed at 2013-02-08T14:40:14Z by PRabahy

  • Description modified (diff)

Improved readability of ticket.

comment:2 Changed at 2013-02-10T12:15:26Z by zooko

  • Description modified (diff)

escape "wiki words"

comment:3 Changed at 2013-02-10T12:16:20Z by zooko

  • Description modified (diff)

quote literals

comment:4 Changed at 2013-02-10T12:32:43Z by zooko

Dear PRabahy:

Thank you for reporting this!

This seems like a bug which has pretty bad consequences for availability. It doesn't ring a bell -- I don't remember seeing this sort of misbehavior reported before. Is it consistently reproducible, or does the behavior sometimes vary? Are there any incident report files in the gateway's base directory? Please see wiki:HowToReportABug. Thanks!

comment:5 Changed at 2013-02-11T13:40:00Z by PRabahy

The bug was reproducible at the time. I tried the upload several times before I ended up with the trace that I posted above. Unfortunately, I just tried it again and the "cp" now works just fine.

I don't see any incident reports and have already restarted the node. If it happens again, I will make sure to grab/post a log.

comment:6 Changed at 2013-02-11T21:38:07Z by davidsarah

  • Component changed from unknown to code-mutable
  • Keywords mutable publish heisenbug added

It's quite possible that a modification to the public directory by another gateway resolved whatever condition was causing the modification by PRabahy's gateway to fail. In that case, I'm not very hopeful of finding out what was wrong :-(

comment:7 Changed at 2013-02-12T19:32:15Z by PRabahy

I think it is happening again. I don't know if this is relevant or not, but I noticed that the node that originally made the directory is offline.

This time, I can do "ls", but "stats" and "cp" are returning 410.

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe ls public:
CryptoResearch
DNSBench.exe
FileZilla_3.6.0.2_win32-setup.exe
bitcoin-0.7.2-win32-setup.exe
cahewson-test
multibit-0.4.19-windows.exe
polipo.1
python-2.7.3.msi
test.jpg
test_for_martin
test_for_martin-readonly

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe check public:
Summary: Unhealthy: 7 shares (enc 1-of-10)
 storage index: txm5k7xe52cw3d4kny372i46ly
 good-shares: 7 (encoding is 1-of-10)
 wrong-shares: 0

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe stats public:
ERROR: 410 Gone
UnrecoverableFileError: the directory (or mutable file) could not be retrieved, because there were insufficient good shares. This might indicate that no servers were connected, insufficient servers were connected, the URI was corrupt, or that shares have been lost due to server departure, hard drive failure, or disk corruption. You should perform a filecheck on this object to learn more.

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe cp C:\Windows\Media\onestop.mid public:
Error examining target directory: 410 Gone
UnrecoverableFileError: the directory (or mutable file) could not be retrieved, because there were insufficient good shares. This might indicate that no servers were connected, insufficient servers were connected, the URI was corrupt, or that shares have been lost due to server departure, hard drive failure, or disk corruption. You should perform a filecheck on this object to learn more.

I am trying to run "check --repair" but it appears to have hung for about 10-15 minutes, so I'm not sure if it is stuck.

comment:8 Changed at 2013-02-12T20:08:21Z by PRabahy

"check --repair" finally finished. It reported that the repair was successful, but that the directory was still unhealthy (bug?). I then ran "ls" and another "check":

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe check --repair public:
Summary: not healthy
 storage index: txm5k7xe52cw3d4kny372i46ly
 good-shares: 5 (encoding is 1-of-10)
 wrong-shares: 0
 repair successful

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe ls public:
CryptoResearch
DNSBench.exe
FileZilla_3.6.0.2_win32-setup.exe
bitcoin-0.7.2-win32-setup.exe
cahewson-test
multibit-0.4.19-windows.exe
polipo.1
python-2.7.3.msi
test.jpg
test_for_martin
test_for_martin-readonly

C:\Users\paul.rabahy\Downloads\allmydata-tahoe-1.9.2\bin>tahoe check public:
Summary: Unhealthy: multiple versions are recoverable
 storage index: txm5k7xe52cw3d4kny372i46ly
 good-shares: 10 (encoding is 1-of-10)
 wrong-shares: 5

Every time I check, there are 2 storage nodes that appear online (according to the WUI).
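
For anyone comparing these traces, the following is a minimal sketch (not part of Tahoe-LAFS) that parses the human-readable "check" output in the format pasted in this ticket and tests whether the reported good-share count meets the k in the k-of-N encoding. The function names parse_check_output and enough_shares are hypothetical, and the regular expressions assume the output looks exactly like the traces above. With 1-of-10 encoding a single good share should be enough, so a report of 7 good shares sits oddly next to the "insufficient good shares" error.

```python
import re


def parse_check_output(text):
    """Parse `tahoe check` text output as pasted in this ticket (format assumed)."""
    report = {}
    m = re.search(r"^Summary:\s*(.+)$", text, re.MULTILINE)
    if m:
        report["summary"] = m.group(1).strip()
    m = re.search(r"good-shares:\s*(\d+)\s*\(encoding is (\d+)-of-(\d+)\)", text)
    if m:
        report["good_shares"] = int(m.group(1))
        report["k"] = int(m.group(2))  # shares needed to recover
        report["n"] = int(m.group(3))  # total shares produced
    m = re.search(r"wrong-shares:\s*(\d+)", text)
    if m:
        report["wrong_shares"] = int(m.group(1))
    return report


def enough_shares(report):
    """True if the good-share count is at least k (hypothetical helper)."""
    return report["good_shares"] >= report["k"]


# Trace copied from comment:7 above.
sample = """Summary: Unhealthy: 7 shares (enc 1-of-10)
 storage index: txm5k7xe52cw3d4kny372i46ly
 good-shares: 7 (encoding is 1-of-10)
 wrong-shares: 0
"""

report = parse_check_output(sample)
print(report["good_shares"], "good shares, need", report["k"], "->", enough_shares(report))
```

This only restates what the checker printed; it does not explain why retrieval failed, and it says nothing about which servers hold the shares.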

comment:9 Changed at 2013-02-13T19:16:53Z by zooko

Good job capturing evidence when the problem recurred, PRabahy!
