[tahoe-dev] Lock files in Tahoe directories

Brian Warner warner at lothar.com
Tue Jul 21 13:51:29 PDT 2009


Shawn Willden wrote:

> The idea is that I'll create a "lockfile directory", in which each
> client will write a lockfile named, say, "<clientid>.lock". This
> directory and its lockfiles will be populated in a controlled way so
> that no two clients are updating it at once. The lockfiles will all be
> empty to begin with.

> The big question is whether or not the delay in step 4 is sufficient
> and, indeed, whether *any* amount of delay is sufficient to guarantee
> that write conflicts cannot occur.

Nope. I think it's the CAP theorem that says you can't get both
consistency and availability in the presence of partitions. The closer
you get to perfection in one axis, the worse it gets in the other ones.
In vague terms, using mutable files to improve consistency in mutable
files is a percentage thing: if you assume some sort of probability
distribution of conflict given a certain write frequency and delay time,
then using one mutable file as a lock for another might get you a P1*P2
probability of conflict.
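
As a rough illustration of that P1*P2 estimate, here is a minimal Python
sketch. It assumes the two conflict events are independent, which nothing
above establishes, and the 1% figures are invented for the example rather
than measured on any grid.

```python
def combined_conflict_probability(p_lock, p_data):
    """Chance that both the lock file and the data file hit a conflict.

    Assumes the two events are independent. That is an assumption made
    for illustration, not a measured property of Tahoe.
    """
    for p in (p_lock, p_data):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be in [0, 1]")
    return p_lock * p_data


# Hypothetical numbers: a 1% conflict chance on each mutable file.
p = combined_conflict_probability(0.01, 0.01)
print("combined conflict probability: %.6f" % p)
```

The product is smaller than either input, but it only reaches zero if one
of the inputs is already zero, which is the point: the lock file lowers
the odds without removing them.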

Using a coordination server is a lot simpler, and will get you perfect
consistency, at the expense of availability: if you can't reach the
coordination server, you aren't allowed to do anything. Also, because
(fundamentally speaking) connections are merely a useful fiction created
out of non-guaranteed messages and timeouts, there are lots of failure
modes which could cause lost progress.
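
To show the shape of the tradeoff, here is a toy in-memory coordinator in
Python that hands out one write lease per file. Everything in it
(WriteCoordinator, acquire, release, lease_seconds) is a hypothetical name
invented for this sketch; it is not Tahoe code and it ignores networking
entirely.

```python
import time


class WriteCoordinator:
    """Toy coordinator: at most one unexpired write lease per file."""

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self._lease_seconds = lease_seconds
        self._clock = clock  # injectable so the timeout can be tested
        self._leases = {}    # file_id -> (client_id, expiry_time)

    def acquire(self, file_id, client_id):
        """Return True if client_id now holds the write lease for file_id."""
        now = self._clock()
        holder = self._leases.get(file_id)
        if holder is not None:
            owner, expires = holder
            if owner != client_id and expires > now:
                return False  # someone else holds an unexpired lease
        # Grant a new lease, or renew the caller's existing one.
        self._leases[file_id] = (client_id, now + self._lease_seconds)
        return True

    def release(self, file_id, client_id):
        """Drop the lease, but only if client_id is the current holder."""
        holder = self._leases.get(file_id)
        if holder is not None and holder[0] == client_id:
            del self._leases[file_id]


# Usage with a fake clock so the lease timeout is visible.
now = [0.0]
coordinator = WriteCoordinator(lease_seconds=10.0, clock=lambda: now[0])
print(coordinator.acquire("file-1", "alice"))  # alice gets the lease
print(coordinator.acquire("file-1", "bob"))    # bob is refused
now[0] = 11.0                                  # alice's lease times out
print(coordinator.acquire("file-1", "bob"))    # bob gets it now
```

The timeout is where the lost progress comes from: a client that was
merely slow or briefly disconnected finds its lease handed to someone
else, and whatever it was in the middle of has to be redone.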

The complexity of this stuff is what prompted us to write down the Prime
Coordination Directive ("don't do that!") and move on to other
less-impossible tasks, like Accounting :-).

> I guess the best approach would be to implement a coordination server,
> as suggested by mutable.txt. Perhaps if someone who knows the codebase
> outlined where I should look to get started on that, maybe I could do
> that. I would think coordination should be a service offered by a
> node, rather than a node type, probably enabled by setting
> "[coordinator]/enabled=true" in tahoe.cfg, similar to the helper
> config.

Yeah, definitely. We're certainly moving to a model in which you start a
generic "node", running one or more services, and a coordinator would be
one of those service types. (Maybe use a more descriptive name:
"write-coordinator"? "consistency-coordinator"? I'm not sure.)

You could have each coordinator announce itself through the Introducer,
and then other clients could subscribe to hear about them. Or you could
have the coordinator-providing node simply write its FURL to a local
file (like we do with the helper) and then manually copy it into a
[client]coordinator.furl= entry in the clients who should use it. You'd
have to decide whether all mutable files will be accessed through the
coordinator, or if there should be some flag to enable/disable the
coordinator for each dirnode.add_child operation (it'd probably be
easier to add the coordinator FURL to the parent dirnode's edge metadata
for everything).

You might also want a tahoe.cfg setting to stop using the coordinator
altogether (even if the edge metadata said it should be used), if the
coordinator goes away and you want to regain availability at the expense
of consistency.
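
The lookup order described above (local off-switch first, then edge
metadata, then a client-wide FURL) could be sketched like this in Python.
None of these settings exist in Tahoe: the coordinator.furl and
coordinator.enabled keys, the "coordinator-furl" metadata key, the
choose_coordinator helper, and the placeholder FURL are all assumptions
made up for the example.

```python
import configparser

# Hypothetical tahoe.cfg fragment. The coordinator.* keys are invented
# for this sketch, and the FURL is a placeholder, not a real one.
EXAMPLE_CFG = """
[client]
coordinator.furl = pb://tubid@example.invalid:12345/coordinator
coordinator.enabled = false
"""


def choose_coordinator(cfg_text, edge_metadata=None):
    """Return the coordinator FURL to use, or None to write uncoordinated."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)

    # The local off-switch wins over everything, including edge metadata,
    # so a client can regain availability when the coordinator is gone.
    if not parser.getboolean("client", "coordinator.enabled", fallback=True):
        return None

    # Next preference: a FURL stored in the parent dirnode's edge metadata.
    per_edge = (edge_metadata or {}).get("coordinator-furl")
    if per_edge:
        return per_edge

    # Otherwise fall back to the client-wide FURL, if one is configured.
    return parser.get("client", "coordinator.furl", fallback=None)


edge = {"coordinator-furl": "pb://other@example.invalid:12345/coordinator"}
print(choose_coordinator(EXAMPLE_CFG, edge))
```

With coordinator.enabled set to false, as in the fragment, the function
returns None even though the edge metadata names a coordinator.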

Also, it might be easier to just have a single client-wide coordinator
FURL, instead of putting a separate one into each mutable file object.
If all of your clients use the same one, then you'll get the same
consistency properties. We were only talking about having per-file
coordinators to make the system more scalable: if your grid only has
tens or hundreds of clients, it'd be simpler to just use a single static
coordinator.

cheers,
 -Brian

