[tahoe-lafs-trac-stream] [Tahoe-LAFS] #2409: tolerate simultaneous uploads better
Tahoe-LAFS
trac at tahoe-lafs.org
Wed Apr 22 00:14:05 UTC 2015
#2409: tolerate simultaneous uploads better
---------------------------+---------------------------
 Reporter:  warner         |          Owner:
     Type:  defect         |         Status:  new
 Priority:  normal         |      Milestone:  undecided
Component:  code-encoding  |        Version:  1.10.0
 Keywords:                 |  Launchpad Bug:
---------------------------+---------------------------
In the Nuts+Bolts meeting this morning, we discussed what would happen if
an application (in particular the "magic folder / drop-upload" feature)
were to upload two copies of the same file at the same time. We thought
about this a long time ago, but I can't seem to find a ticket for this
particular issue.

I believe there's a race condition on the storage servers which would make
the upload go less smoothly than we'd like. The first upload will see no
shares for each storage index, so it will allocate a BucketWriter and
start writing the share. The second upload will compute the storage-index,
ask the server about pre-existing shares, and then... probably get a yes?
The answer is uncertain, and depends upon the server implementation. The
server's read-side might look on disk for the partially-written files, or
the server's write-side might be using the write-to-tempfile atomic-swap
technique, or the read-side might be looking in a leasedb for evidence of
the share. Some of these will result in a "no" answer to the DYHB ("do you
have block") query, in which case the second upload will try to allocate
new BucketWriters to
fill the shares (which might fail because of the existing writers, or
might succeed with hilarious results as the two writers attempt to write
the same file with hopefully the same data). It might get a "yes", in
which case I think the uploader will ignore the shares and assume that
they'll be present in the future.
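
To make the ambiguity concrete, here is a rough sketch (the names are made
up, not the real storage-server code) of three plausible policies a server
could use to answer the second uploader's query while the first upload is
still in flight:

{{{#!python
import os

class SketchStorageServer:
    """Illustration only: three different ways a server might answer
    "do you have shares for this storage index?"."""
    def __init__(self, sharedir, leasedb):
        self.sharedir = sharedir                                # finished shares
        self.incomingdir = os.path.join(sharedir, "incoming")   # partial writes
        self.leasedb = leasedb                                  # storage_index -> lease info

    def dyhb_completed_only(self, storage_index):
        # answers "no" while upload #1 is still writing: only finished
        # share files count
        return os.path.exists(os.path.join(self.sharedir, storage_index))

    def dyhb_including_partial(self, storage_index):
        # answers "yes" as soon as upload #1 has allocated a BucketWriter,
        # even though the share is not yet complete or readable
        return (os.path.exists(os.path.join(self.sharedir, storage_index)) or
                os.path.exists(os.path.join(self.incomingdir, storage_index)))

    def dyhb_from_leasedb(self, storage_index):
        # answer depends entirely on when the leasedb row gets written
        return storage_index in self.leasedb
}}}

Which of these the real server effectively implements is the first thing
to pin down.
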
We should probably:
* nail down exactly what the server does in this situation (probe sketch
  below)
* change the Uploader to be more cautious about pre-existing shares
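
For the first bullet, a throwaway probe along these lines would pin the
behavior down. The `allocate_buckets` / `get_buckets` names roughly follow
the immutable storage protocol, but treat the exact signatures here as
assumptions:

{{{#!python
def probe_simultaneous_allocate(server, storage_index, renew, cancel, canary):
    # client A starts writing share 0 but never closes it
    already_a, writers_a = server.allocate_buckets(
        storage_index, renew, cancel,
        sharenums=set([0]), allocated_size=1000, canary=canary)
    writers_a[0].write(0, b"x" * 500)                # half-written share

    # client B now asks the two questions this ticket cares about
    readers_b = server.get_buckets(storage_index)    # the DYHB
    already_b, writers_b = server.allocate_buckets(
        storage_index, renew, cancel,
        sharenums=set([0]), allocated_size=1000, canary=canary)

    # does B see the half-written share, get handed a competing writer,
    # or get told the share is already there?
    return readers_b, already_b, writers_b
}}}
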
The Uploader could read the pre-existing shares as it goes, comparing them
against locally-generated ones. If they match, great, those shares can
count against the servers-of-happiness criteria. If they don't, or if they
aren't complete, then oops. The simplest way to deal with such problems is
to treat them like a share write that failed (as if the server
disconnected before the upload was complete), which may flunk the
servers-of-happiness test and mark the upload as failing. A more sophisticated
approach (which hopefully is ticketed elsewhere) is to have a second pass
which writes out a new copy of any share that wasn't successfully placed
during the first pass.
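
A minimal sketch of the compare-against-local step (hypothetical helper;
the real Uploader would hang this off its share-placement code). The only
assumption is that the server gives us some read access,
`remote_read(offset, length)`, to the pre-existing share:

{{{#!python
import hashlib

def share_matches(remote_read, local_share):
    """Return True if the pre-existing remote share is identical to the
    locally-generated one and may therefore count toward happiness;
    False means: treat it like a share write that failed."""
    try:
        remote = remote_read(0, len(local_share))
    except Exception:
        return False                    # unreadable / still in progress
    if len(remote) < len(local_share):
        return False                    # incomplete share: oops
    return (hashlib.sha256(remote).digest() ==
            hashlib.sha256(local_share).digest())

# toy usage: bytes held in memory stand in for the remote share
existing = b"erasure-coded share bytes"
local = b"erasure-coded share bytes"
print(share_matches(lambda off, n: existing[off:off+n], local))  # True
}}}
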
If we implement that verify-during-upload thing, we'll need to think
carefully about how simultaneous uploads ought to work. I think we'll need
a way to mark shares as "in-progress", which tells the second uploader
that it can't verify the share yet, but that maybe it shouldn't upload it
anyway.
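
One way the marker could look (purely illustrative; no such state exists in
the current protocol) is an explicit per-share state in the DYHB answer,
which the second uploader could act on:

{{{#!python
# hypothetical per-share states a richer DYHB answer could carry
ABSENT, IN_PROGRESS, COMPLETE = "absent", "in-progress", "complete"

def second_uploader_plan(state):
    if state == COMPLETE:
        return "verify against locally-generated share"
    if state == IN_PROGRESS:
        # can't verify it yet, but probably shouldn't start a competing write
        return "defer, or skip this share for now"
    return "upload it"
}}}
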
This will get better when we make the storage-index be a hash of the share
(or the root of a merkle tree with the shares in the leaves), because then
the storage-index won't even be defined until the upload is complete, and
the intermediate in-progress state will disappear. Simultaneous uploads
will then turn into two uploads of the exact same share, detected at
`close()`, which is inefficient but sound, I think.
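
Very roughly (the hash and layout here are illustrative; #643 is the real
proposal), the storage-index only comes into existence once the share bytes
are final, so the collision is trivially detectable at `close()`:

{{{#!python
import hashlib

def storage_index_for(share_bytes):
    # flat hash for illustration; a merkle-tree root over the shares
    # would serve the same argument
    return hashlib.sha256(share_bytes).hexdigest()[:32]

def close_bucket(server_store, share_bytes):
    """server_store: dict mapping storage-index -> share bytes."""
    si = storage_index_for(share_bytes)
    if si in server_store:
        # the second of two simultaneous uploads of the same share lands
        # here: wasted bandwidth, but nothing inconsistent on disk
        return "duplicate", si
    server_store[si] = share_bytes
    return "stored", si

store = {}
print(close_bucket(store, b"share data"))   # ('stored', ...)
print(close_bucket(store, b"share data"))   # ('duplicate', ...)
}}}
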
Related tickets:
* #610 "upload should take better advantage of existing shares"
* #643 make the storage index be the verifier cap
* #873 upload: tolerate lost or unacceptably slow servers
* #1288 support streaming uploads in uploader
* #1508 shortcut encryption and erasure coding when a file has already
been uploaded
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2409>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage