#1429 closed enhancement (fixed)

automatically upload a file when it is put in a given local directory — at Version 12

Reported by: davidsarah Owned by:
Priority: major Milestone: 1.9.0
Component: code-frontend Version: 1.8.2
Keywords: drop-upload inotify usability review-needed news Cc:
Launchpad Bug:

Description (last modified by davidsarah)

During the Tahoe-LAFS summit, I (David-Sarah) implemented a prototype of a dropbox-like uploader: if you write a file into a given directory, it will upload it to a given Tahoe mutable directory (with the same name as its name in the local filesystem).

Its current limitations are:

  • it only handles one directory, not including subdirectories
  • the behaviour when multiple files are put in the directory (or the same file more than once) has not been well-tested so far
  • it doesn't have unit tests or docs (I will write those)
  • it must be given the URI of a mutable directory to upload to; it would be more usable to allow that to be a path that may start with an alias
  • it uses the twisted.internet.inotify API, which depends on Twisted 10.1 and is only supported on platforms with inotify.

Change History (16)

comment:1 Changed at 2011-07-16T16:28:10Z by davidsarah

  • Description modified (diff)
  • Status changed from new to assigned

comment:2 Changed at 2011-07-16T16:30:22Z by davidsarah

  • Description modified (diff)
  • Keywords inotify usability added

Changed at 2011-07-16T16:44:07Z by davidsarah

Prototype implementation of drop-upload from Tahoe-LAFS summit. No tests or docs. (Corrected to include new file.)

comment:3 Changed at 2011-07-16T16:45:47Z by davidsarah

  • Keywords design-review-needed added
  • Owner davidsarah deleted
  • Status changed from assigned to new

comment:4 Changed at 2011-07-19T00:54:15Z by davidsarah

http://tahoe-lafs.org/trac/tahoe-lafs/attachment/ticket/1435/dependency-updates.darcs.patch updates the dependency for Twisted to 10.1, which ensures that twisted.internet.inotify is available. (Note that it conflicts with drop-upload.darcs.patch. I'll rebase the latter when I have tests and docs.)

comment:5 Changed at 2011-07-22T05:26:54Z by david-sarah@…

In 8b4082677477daf1:

Update the dependency on Twisted to >= 10.1. This allows us to simplify some documentation: it's no longer necessary to install pywin32 on Windows, or apply a patch to Twisted in order to use the FTP frontend. fixes #1274, #1438. refs #1429

comment:6 follow-up: Changed at 2011-07-22T20:40:32Z by zooko

Here is a review of attachment:drop-upload.darcs.patch .

First of all: yay! This feature is neat. :-)

  • What is the "FIXME: Unicode paths"? I don't see what needs to change there to handle non-ASCII paths.
  • Why are we catching Exception--was it just because we might have Twisted < 10.1? We should remove any checks for if we have Twisted < 10.1 (except, of course, for the checks we do for all dependencies in src/allmydata/__init__.py).
  • "FIXME: error reporting" sounds like a serious incomplete part, but what exactly should be added? Logging to the Tahoe-LAFS logging system of certain sorts of failures? What could fail here?
  • Perhaps the logging in _notify() is too verbose.
  • the "dump event to file" part of _notify() is just for debugging and should be removed.
  • "FIXME: what happens if we get another event for the same file while it is uploading?" Well, I looked at the code of dirnode.add_file() and Uploader.upload() and I guess it just redundantly uploads it a second time. (See also #952, which is about what happens when you upload the bytewise-identical file more than once at a time through the WAPI.) What should it do? Well, what should the user experience look like? You drag and drop a file into this directory, and then you drag and drop a new version of it one second later, and then what? Maybe the visual display of the directory should have been updated as soon as you dropped the first one to show that it is in the middle of uploading that file. Then, if you drop another version on it, the display should update to show that there is another update scheduled to happen after this one completes. So I guess we need a new ticket for a GUI frontend to this functionality (integrated into some extant GUI like Nautilus, I suppose). But at this layer, I guess what it should do is schedule another upload for as soon as this upload is finished unless another such upload is already scheduled, in which case it should do nothing.
  • "FIXME: should all notified events cause an upload?": from http://www.kernel.org/doc/man-pages/online/pages/man7/inotify.7.html and http://inotify.aiken.cz/?section=inotify&page=faq&lang=en it looks like we should respond to IN_CLOSE_WRITE | IN_CREATE | IN_MOVE_SELF | IN_MOVE_TO
  • Also of course the other FIXMEs deserve to be investigated more. :-)

Other than that, it looks good. It is neat how few lines of code are needed to add this functionality! :-)

comment:7 in reply to: ↑ 6 ; follow-up: Changed at 2011-07-22T21:03:44Z by davidsarah

Replying to zooko:

Here is a review of attachment:drop-upload.darcs.patch .

First of all: yay! This feature is neat. :-)

Thanks :-)

  • What is the "FIXME: Unicode paths"? I don't see what needs to change there to handle non-ASCII paths.

tahoe.cfg is encoded in UTF-8, but the path wants to be in sys.getfilesystemencoding().

  • Why are we catching Exception--was it just because we might have Twisted < 10.1? We should remove any checks for if we have Twisted < 10.1 (except, of course, for the checks we do for all dependencies in src/allmydata/__init__.py).

If there is any exception here, the node will silently fail to start. (The same is true of exceptions during initialization of the other frontends.) We should fix that in a better way -- it seems relevant to #355 and #1360 for example -- but I wanted to avoid this failure mode while I was debugging.

  • "FIXME: error reporting" sounds like a serious incomplete part, but what exactly should be added? Logging to the Tahoe-LAFS logging system of certain sorts of failures? What could fail here?

We should raise an exception with a proper message (and ensure that gets logged) rather than failing an assert.

  • Perhaps the logging in _notify() is too verbose.
  • the "dump event to file" part of _notify() is just for debugging and should be removed.

Yes, it was just easier to see the events that way while prototyping. Logging to the Twisted log is sufficient, and the message could be shorter.

  • "FIXME: what happens if we get another event for the same file while it is uploading?" Well, I looked at the code of dirnode.add_file() and Uploader.upload() and I guess it just redundantly uploads it a second time.

Yes, I think this is fairly harmless, modulo the fact that we should probably be using a more restrictive event mask.

(See also #952, which is about what happens when you upload the bytewise-identical file more than once at a time through the WAPI.) What should it do? Well, what should the user experience look like? You drag and drop a file into this directory, and then you drag and drop a new version of it one second later, and then what? Maybe the visual display of the directory should have been updated as soon as you dropped the first one to show that it is in the middle of uploading that file. Then, if you drop another version on it, the display should update to show that there is another update scheduled to happen after this one completes. So I guess we need a new ticket for a GUI frontend to this functionality (integrated into some extant GUI like Nautilus, I suppose). But at this layer, I guess what it should do is schedule another upload for as soon as this upload is finished unless another such upload is already scheduled, in which case it should do nothing.

Right, I'll open another ticket about suppressing redundant uploads.

That looks about right, although IN_MOVE_SELF should be treated differently, since it's moving the local directory away from its current path. I don't know what that should do -- probably the safest thing is to disable the drop-upload feature entirely until the node is restarted (and say "don't do that" in the docs).

  • Also of course the other FIXMEs deserve to be investigated more. :-)

Other than that, it looks good. It is neat how few lines of code are needed to add this functionality! :-)

Yeah, twisted.internet.inotify rocks!

comment:8 in reply to: ↑ 7 Changed at 2011-07-23T04:52:05Z by davidsarah

Replying to davidsarah:

Replying to zooko:

  • What is the "FIXME: Unicode paths"? I don't see what needs to change there to handle non-ASCII paths.

tahoe.cfg is encoded in UTF-8, but the path wants to be in sys.getfilesystemencoding().

Also, in _notify the conversion to a Tahoe Unicode filename is wrong (unicode(filepath.basename()) will use the default ASCII encoding, when it should be converting from sys.getfilesystemencoding()).

Changed at 2011-07-25T04:40:02Z by davidsarah

Drop-upload frontend, with tests (but no documentation). refs #1429

comment:9 Changed at 2011-07-25T04:44:30Z by davidsarah

  • Keywords review-needed added; design-review-needed removed
  • Milestone changed from soon to 1.9.0

The tests could do with covering a few more corner cases and error conditions, but I'm pretty sure there will be enough time for that before the beta.

comment:10 follow-up: Changed at 2011-07-25T06:14:03Z by warner

  • Type changed from defect to enhancement

This is pretty cool stuff. I'm hesitant about landing it in Tahoe core, though.. at least in its present form, it feels more like something that wants to be in a plugin, or in an extensions/ sort of directory. If it were a more complete Dropbox-ish replacment, I'd feel better about it: watching all directories under a root, handling modification of existing files not just new ones, and ideally some kind of multiple-client support. (as is, it's more like an automatic backup tool than a sync-a-virtual-directory-across-multiple-machines tool).

Can you envision maintaining its current UI for a couple years? Or do you think you'll look at it in six months and go "oh, that should really look like X instead".

comment:11 in reply to: ↑ 10 Changed at 2011-07-25T12:24:40Z by davidsarah

Replying to warner:

This is pretty cool stuff. I'm hesitant about landing it in Tahoe core, though.. at least in its present form, it feels more like something that wants to be in a plugin, or in an extensions/ sort of directory.

We don't have a plugin mechanism. In any case, I'm not encouraged by the fate of things that were previously out-of-core, like the FUSE implementations. I'd prefer it to be in the core, tested by default, and available for all users to try out without installing anything else.

If it introduced new dependencies that would be a different matter, but it doesn't (and we'd decided that it was OK to depend on Twisted 10.1, which has other advantages).

If it were a more complete Dropbox-ish replacment, I'd feel better about it: watching all directories under a root, handling modification of existing files not just new ones, and ideally some kind of multiple-client support. (as is, it's more like an automatic backup tool than a sync-a-virtual-directory-across-multiple-machines tool).

I didn't intend it to be a sync-a-virtual-directory-across-multiple-machines tool. It just uploads things that you drop into a particular directory. Perhaps the name 'drop-upload' suggests that it is more similar to Dropbox than it is. We have time before the 1.9 release to rename it, if a better name is suggested.

It already handles modification of existing files (I'll add that to the tests). Watching all directories under a root is #1433, and is a relatively straightforward evolution of the current code.

Can you envision maintaining its current UI for a couple years?

Yes, absolutely. It'll be easy to maintain compatibility with the current UI, that's only a few lines of code and documentation. It also needs to be in core to get sufficient feedback on the UI to improve it.

comment:12 Changed at 2011-07-27T01:07:04Z by davidsarah

  • Description modified (diff)

Changed at 2011-07-27T01:08:57Z by davidsarah

attachment:drop-upload-3.darcs.patch fixes some bugs in error paths, and has better test coverage. The tests also pass on Windows now.

Changed at 2011-07-27T03:31:46Z by davidsarah

drop-upload: make counts visible on the statistics page, and disable some debugging. refs #1429

Note: See TracTickets for help on using tickets.