[tahoe-dev] configuring sqlite efficiency and durability
Brian
warner at lothar.com
Tue Dec 4 04:22:58 UTC 2012
On 12/3/12 6:13 PM, Zooko Wilcox-O'Hearn wrote:
> In fact, as David-Sarah pointed out today, since we update the leasedb
> and then write out the full share data before acking on file upload,
> the window of opportunity for this failure is probably zero on file
> upload.
Hm, we should probably analyze what happens with races in the "INCOMING"
state, as new shares are being written to disk or S3. If I remember the
plan correctly, we do:
1: leasedb.write state=INCOMING (atomic)
2: write share to storage (non-atomic)
3: leasedb.write state=PRESENT (atomic)
and if we ever see a surprising share in the INCOMING state, we assume
that it's partially-written and should therefore delete it.
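
The three steps above can be sketched roughly as follows. This is only an illustration, not the actual leasedb code: the `shares` table, its columns, the `store` dict standing in for disk or S3, and all function names are assumptions made up for the example.

```python
import sqlite3

def open_leasedb(path=":memory:"):
    # Hypothetical schema: one row per share, with a state column.
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS shares "
        "(share_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    db.commit()
    return db

def upload_share(db, store, share_id, data):
    # Step 1: record the share as INCOMING (one atomic commit).
    with db:
        db.execute(
            "INSERT OR REPLACE INTO shares VALUES (?, 'INCOMING')",
            (share_id,),
        )
    # Step 2: write the share bytes. This is the non-atomic part;
    # `store` is a plain dict standing in for disk or S3.
    store[share_id] = data
    # Step 3: mark the share as PRESENT (one atomic commit).
    with db:
        db.execute(
            "UPDATE shares SET state='PRESENT' WHERE share_id=?",
            (share_id,),
        )

def delete_surprising_incoming(db, store):
    # Any share still marked INCOMING when nothing is uploading is
    # assumed to be partially written, so it gets deleted.
    rows = db.execute(
        "SELECT share_id FROM shares WHERE state='INCOMING'"
    ).fetchall()
    deleted = []
    for (share_id,) in rows:
        store.pop(share_id, None)
        deleted.append(share_id)
    with db:
        db.execute("DELETE FROM shares WHERE state='INCOMING'")
    return deleted
```

Calling `delete_surprising_incoming` at startup is one possible place for the cleanup; the plan described here does not say when the check runs.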
If a kernel crash causes the leasedb to roll back to state=INCOMING after
step 3, then I guess we'd delete a perfectly valid share. If it rolls
back all the way to step 0 (i.e. no entry in the leasedb), then the
surprise share would get a starter lease, right? Both seem ok: the worst
case is that we spontaneously lose whatever shares were in flight or
very new at the time of a kernel crash. Except for the
not-honoring-our-ACK thing, we could almost pretend that the
interruption happened before the share was fully received.
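
To make the rollback cases concrete, here is a small decision function that maps what the leasedb says and what is actually in storage to the action described above. The function name and the action strings are hypothetical. The PRESENT-but-missing case is not covered by the analysis in this message, so the sketch just flags it.

```python
def reconcile(db_state, share_on_disk):
    """Decide what to do with one share after a crash.

    db_state is None (no leasedb entry), 'INCOMING' or 'PRESENT'.
    share_on_disk says whether share data exists in storage.
    The returned action names are made up for this sketch.
    """
    if db_state is None:
        # Rolled back to "step 0": a surprise share gets a starter lease.
        return "add-starter-lease" if share_on_disk else "nothing-to-do"
    if db_state == "INCOMING":
        # Assumed partially written, even if it was in fact complete
        # and the leasedb merely rolled back from PRESENT.
        return "delete-share"
    if db_state == "PRESENT":
        # Missing data under a PRESENT entry is not discussed above.
        return "keep" if share_on_disk else "unspecified"
    raise ValueError("unknown state: %r" % (db_state,))
```

With this table, a fully written share survives a rollback to no entry at all (it gets a starter lease) but is lost on a rollback to INCOMING, which matches the worst case described above.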
Is there a similar analysis to be done on the way down? Like, we mark
the share as OUTGOING, delete it, then crash? I suspect all the cases
are safe there too, but maybe it'd be a good idea to walk through them.
> So in short, I'm in favor of using synchronous=NORMAL,
> journal_mode=WAL for all current uses of sqlite in Tahoe-LAFS. Let me
> know if you see a flaw in this reasoning!
I'm +0 on that. 3.2 updates-per-second is intolerably slow, so we gotta
do *something*. It'd be nice if some easy DB-subroutine shuffling could
speed that up a bit (first by getting it down to one commit per remote
operation; 1/3.2 s is about 310ms, which is still a lot of seeks), but I know
we'd probably have to sacrifice code clarity to get to that point, and I'd
rather give up on unnecessary leasedb durability than make that code
unreadable (and thus unreliable). So I say go for it.
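
For reference, a minimal sketch of the two ideas in this thread using Python's sqlite3 module: setting the proposed pragmas on a connection, and grouping all the writes for one remote operation into a single commit. The `leases` table and both function names are assumptions for illustration, not the real leasedb layout.

```python
import sqlite3

def open_tuned(path):
    # WAL needs a real file; an in-memory database reports "memory".
    db = sqlite3.connect(path)
    mode = db.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    db.execute("PRAGMA synchronous=NORMAL")
    # Hypothetical table, used only by this example.
    db.execute(
        "CREATE TABLE IF NOT EXISTS leases "
        "(share_id TEXT, expiration INTEGER)"
    )
    db.commit()
    return db, mode

def record_operation(db, leases):
    # All writes for one remote operation go into one transaction,
    # so the operation costs a single commit.
    with db:
        db.executemany(
            "INSERT INTO leases (share_id, expiration) VALUES (?, ?)",
            leases,
        )
```

The pragmas are per-connection settings in the case of `synchronous`, so every place that opens the database would need to set it; `journal_mode=WAL` is persistent in the database file once set.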
cheers,
-Brian