[tahoe-dev] Surprise shares
Brian Warner
warner-tahoe at allmydata.com
Tue Dec 9 14:41:20 PST 2008
On Tue, 09 Dec 2008 22:21:40 +0100
Francois Deppierraz <francois at ctrlaltdel.ch> wrote:
> Great, thanks Brian for sorting this out. Setting the culprit to
> readonly temporarily fixed this issue.
Cool. Just so you know, that "readonly" switch only controls immutable
shares, not mutable ones. If the "readonly" switch is ON, then a storage
server that receives a request to hold immutable shares refuses all of them.
The total volume of immutable files is usually much larger than that of
mutable files (immutable is the default, and mutable files are currently
limited to 3.5MB), so this usually reduces the storage-consumption rate to a
small trickle.
Mutable shares are still accepted in the so-called "readonly" state, both new
shares and modifications of old ones. Again, since these tend to be small,
the disk space they consume usually isn't very much. We use a 20GB
"reservation" (switching to readonly when we get down to 20GB of free space)
on our prodnet, and we figure that will give us at least a year of mutable
share traffic before we use up that 20GB. By that point, we should have a
better solution for the problem :).
So setting the server to "readonly" and also freeing up enough disk space (or
raising the quota) should mitigate this problem for now. Merely setting the
full server to readonly (without doing something to allow small writes to
succeed) probably won't.
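
To make the distinction concrete, here is a rough toy model of the policy
described above. It is only a sketch: the class, its method names, and the
automatic switch at the reservation threshold are my assumptions for
illustration, not the actual storage server code.

```python
# Toy model only: every name here is hypothetical, not Tahoe's real API.
GB = 10**9


class ToyStorageServer:
    def __init__(self, free_bytes, reserved_bytes=20 * GB):
        self.free_bytes = free_bytes
        self.reserved_bytes = reserved_bytes

    @property
    def readonly(self):
        # Assumed policy: go "readonly" once free space is down to the reservation.
        return self.free_bytes <= self.reserved_bytes

    def accepts_immutable(self, size):
        # The "readonly" switch refuses every new immutable share.
        return not self.readonly

    def accepts_mutable(self, size):
        # Mutable writes ignore the switch; only real free space limits them.
        return size <= self.free_bytes


nearly_full = ToyStorageServer(free_bytes=20 * GB)
print(nearly_full.accepts_immutable(1 * GB))    # refused: server is "readonly"
print(nearly_full.accepts_mutable(3_500_000))   # accepted: reservation absorbs it

full = ToyStorageServer(free_bytes=0)
print(full.accepts_mutable(1))                  # refused: no space for small writes
```

The last case is the one that matters here: a server that is already full
still fails mutable writes even with the switch ON, which is why freeing some
space (or raising the quota) is needed as well.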
Oh, also, in thinking about the bug some more, and assuming that the mutable
file publish was triggered by a directory modification, what I suspect
will/did happen is:
* first publish fails with UCWE (UncoordinatedWriteError) due to the
  self-inflicted "surprise shares"
* note however that almost all of the shares have been updated
* dirnode modifier notices the UCWE and tries again
* second publish fails in exactly the same way. Repeat 3 or 4 times
* eventually the dirnode modify fails, and your 'add child' or 'delete
  child' webapi request fails
* the next time you read that directory, you'll see the new version
* subsequent publish attempts will fail in the same way, until the server is
  changed to not emit errors
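
The sequence above can be sketched as a small simulation. This is a toy under
stated assumptions, not the real publish code: the exception class is a
stand-in, the dict-based "servers" and both function names are hypothetical,
and the retry limit of 4 is just one reading of "3 or 4 times".

```python
class ToyUncoordinatedWriteError(Exception):
    """Stand-in for the UCWE; not the real Tahoe exception class."""


def publish(servers, version):
    """Write `version` to every healthy server; raise if any server errored."""
    any_error = False
    for server in servers:
        if server["broken"]:
            # The write error is what ends up looking like a "surprise share".
            any_error = True
        else:
            server["version"] = version
    if any_error:
        raise ToyUncoordinatedWriteError()


def modify_directory(servers, new_version, max_tries=4):
    """Retry the publish a few times, as the dirnode modifier would."""
    for _ in range(max_tries):
        try:
            publish(servers, new_version)
            return True
        except ToyUncoordinatedWriteError:
            continue  # same server, same error: the retry cannot succeed
    return False


servers = [{"version": 1, "broken": False} for _ in range(9)]
servers.append({"version": 1, "broken": True})

ok = modify_directory(servers, new_version=2)
updated = sum(1 for s in servers if s["version"] == 2)
print(ok, updated)
```

In this model the request reports failure even though nine of the ten shares
carry the new version, which matches the "fails, but the next read shows the
new version" behavior. Once the bad server stops erroring, the same call
succeeds.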
thanks so much for helping us test this behavior!
cheers,
-Brian