Opened at 2007-11-30T21:10:48Z
Closed at 2008-01-28T19:06:04Z
#218 closed enhancement (fixed)
resumption of incomplete transfers
Reported by: | zooko | Owned by: | warner |
---|---|---|---|
Priority: | major | Milestone: | 0.8.0 (Allmydata 3.0 Beta) |
Component: | code-network | Version: | 0.7.0 |
Keywords: | upload download partial | Cc: | |
Launchpad Bug: | | | |
Description
Peter mentioned to me that an important operational issue is resumption of large file transfers that are interrupted by network flapping.
To do this, we change the storage servers so that they no longer delete the "incoming" data of an incomplete upload when they detect a connection break. Then we extend the upload protocol so that uploaders learn which blocks of a share are already present on the server and don't re-upload those blocks.
Likewise on download.
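As an editorial illustration, here is a minimal sketch of the resume-aware upload loop this proposal would enable. All of the server method names (query_incoming_blocks, write_block, close_share) are hypothetical, not part of the actual storage-server remote interface:

```python
# Hypothetical sketch of a resume-aware share upload; the server methods
# named here are illustrative only.

def resume_share_upload(server, storage_index, share_num, blocks):
    """Upload only the blocks the server does not already hold."""
    # Ask the server which block numbers of this share survived in its
    # "incoming" directory from a previous, interrupted upload.
    already_present = server.query_incoming_blocks(storage_index, share_num)
    for block_num, block_data in enumerate(blocks):
        if block_num in already_present:
            continue  # skip blocks that made it across before the break
        server.write_block(storage_index, share_num, block_num, block_data)
    server.close_share(storage_index, share_num)  # move incoming -> complete
```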
Change History (6)
comment:1 Changed at 2007-12-31T21:25:58Z by warner
That's an important user-facing feature. There are a couple of different places where it might be implemented, some more appropriate than others. What matters to the user is that their short-lived network link be usable to upload or download large files; they don't really care how exactly this takes place.
The three places where I can see this happening (on upload) are:
For download, things are a bit easier, since we can basically do random-access reads from CHK files, and an HTTP GET request can carry a Range header that tells us which part of the file the client wants to read. We just have to implement support for that.
I'm probably leaning towards the third option (something above PUT), but it depends a lot upon what sort of deployment options we're looking at and which clients are stuck behind the flapping network link.
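To illustrate the download side, here is a minimal sketch of a client resuming an interrupted fetch over HTTP, assuming the GET handler honors Range requests; the URL shape and function are illustrative, not part of the actual webapi:

```python
# Sketch of resuming a partial download via the HTTP Range header.
import os
import urllib.request

def resume_download(url, local_path):
    """Fetch only the bytes we do not already have on disk."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    req = urllib.request.Request(url, headers={"Range": "bytes=%d-" % offset})
    with urllib.request.urlopen(req) as resp:
        # a 206 Partial Content status means the server honored the Range
        # header and we can append; a 200 means it sent the whole file, so
        # we must start the local copy over
        mode = "ab" if resp.status == 206 else "wb"
        with open(local_path, mode) as f:
            while True:
                chunk = resp.read(2 ** 16)
                if not chunk:
                    break
                f.write(chunk)
```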
comment:2 Changed at 2008-01-05T03:53:11Z by warner
- Milestone changed from undecided to 0.9.0
comment:3 Changed at 2008-01-08T23:26:30Z by warner
I believe (correct me if I'm wrong) the current thinking is that this feature will be provided through the Offloaded Uploader (#116), operating in a spool-to-disk-before-encode mode.
The idea is that the client (who has a full copy of the file and has done one read pass to compute the encryption key and storage index) sends the storage index (SI) to the helper, which checks the appropriate storage servers and either says "it's there, don't send me anything", "it isn't there, send me all your crypttext", or "some of it is here on my local disk, send me the rest of the crypttext". In the last case, the helper requests the byte ranges that it still needs, repeating as necessary until it has the whole (encrypted) file on its disk. Then the helper encodes and pushes the shares. We assume that the helper is running in a well-managed environment: it is neither shut down frequently nor does it lose network connectivity to the storage servers frequently. The helper is also much closer to the storage servers, network-wise, so it is OK if an upload must be restarted, as long as the file doesn't have to be transferred over the home user's (slow) DSL line multiple times.
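A hedged sketch of the client/helper negotiation described above; the method names (check, write_ciphertext, finish) and the shape of the answer object are illustrative, not the real #116 interface:

```python
# Illustrative client-side loop for uploading ciphertext through the helper.

def upload_via_helper(helper, storage_index, ciphertext_file):
    # The helper checks the storage servers and its own spool directory,
    # then answers with one of the three cases described above.
    answer = helper.check(storage_index)
    if answer.already_in_grid:
        return answer.uri  # "it's there, don't send me anything"
    offset = answer.ciphertext_bytes_held  # 0 if nothing is spooled yet
    ciphertext_file.seek(offset)
    while True:
        chunk = ciphertext_file.read(2 ** 16)
        if not chunk:
            break
        helper.write_ciphertext(storage_index, offset, chunk)
        offset += len(chunk)
    # once the helper holds the whole file, it encodes and pushes the shares
    return helper.finish(storage_index)
```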
This provides the resume-interrupted-upload behavior for home users who are running their own node (when using the Offloaded Uploader helper). It does not help users who are running a plain web browser (and thus uploading files with HTTP POSTs to an external web server); to help a web browser, we'd need an ActiveX control, or perhaps Flash, or something similar. It also doesn't help friendnet installations that do not have a helper node running closer to the storage servers than the client. This seems like an acceptable tradeoff.
comment:4 Changed at 2008-01-09T01:06:11Z by warner
- Milestone changed from 0.9.0 (Allmydata 3.0 final) to 0.8.0 (Allmydata 3.0 Beta)
as I read the milestones, this belongs in 0.8.0
comment:5 Changed at 2008-01-24T00:21:37Z by warner
- Owner set to warner
- Status changed from new to assigned
comment:6 Changed at 2008-01-28T19:06:04Z by warner
- Resolution set to fixed
- Status changed from assigned to closed
OK, this is now complete in the CHK upload helper. Clients which use the helper will send their ciphertext to the helper, where it gets stored in a holding directory (BASEDIR/helper/CHK_incoming/) until it is complete. If the client is lost, the partial data is retained for later resumption. When the incoming data is complete, it is moved to a different directory (CHK_encoding/) and then the encode+push process begins.
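A sketch of the helper-side spooling just described; the directory names match the comment (BASEDIR/helper/CHK_incoming/ and CHK_encoding/), but the function and its parameters are illustrative:

```python
# Illustrative spool-and-handoff logic for the CHK upload helper.
import os

def spool_ciphertext(basedir, storage_index_s, expected_size):
    incoming = os.path.join(basedir, "helper", "CHK_incoming", storage_index_s)
    encoding = os.path.join(basedir, "helper", "CHK_encoding", storage_index_s)
    # partial data from a lost client is retained here for later resumption
    have = os.path.getsize(incoming) if os.path.exists(incoming) else 0
    if have < expected_size:
        return ("send me the rest", have)  # client should resume at 'have'
    os.rename(incoming, encoding)  # complete: hand off to encode+push
    return ("complete", have)
```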
The #116 helper is not complete (it still lacks support for avoiding uploads of files which are already present in the grid), but this portion of it is, so I'm closing out this ticket.
I think we still need some sort of answer for incomplete downloads, so I'm opening a new ticket for the download side (#288).