#1304 closed defect (fixed)

'tahoe cp' copying to a mutable file should replace the contents

Reported by: davidsarah Owned by: kevan
Priority: major Milestone: 1.9.0
Component: code-frontend-cli Version: 1.8.1
Keywords: tahoe-cp usability Cc:
Launchpad Bug:

Description

When tahoe cp copies onto an existing mutable file (in a writable directory), it replaces the directory entry. This is not consistent with SFTP, which would replace the contents of the file. Replacing the contents is more similar to a conventional filesystem, and I think more likely to be expected by users who are using mutable files.

Notice that TahoeFileTarget in src/allmydata/scripts/tahoe_cp.py@4768#L184 never uses the mutable variable passed into its constructor.

Although the comment there mentions #835, this is not the same bug. #835 is about adding options to allow specifying whether directories are copied as mutable or immuutable.

Attachments (3)

1304.darcspatch (27.2 KB) - added by kevan at 2011-07-26T19:24:16Z.
1304.2.darcspatch (33.8 KB) - added by kevan at 2011-07-29T17:10:54Z.
added error test
1304.3.darcspatch (33.8 KB) - added by kevan at 2011-07-29T20:32:20Z.
fixed typo

Download all attachments as: .zip

Change History (23)

comment:1 Changed at 2011-01-13T08:53:15Z by davidsarah

There's also a case for making tahoe cp pay attention to the no-write flag introduced in #1063. It's only a fairly weak case though, because that flag was added for the benefit of clients who are expecting permissions semantics as close as possible to a POSIX filesystem (e.g. clients using sshfs), and users of tahoe cp are probably not expecting that.

comment:2 Changed at 2011-01-14T14:28:11Z by gdt

I would expect copy with an existing target to write a new file in the same directory with a temporary name and then call the rename() system call. rename guarantees that either the old target or the new will still exist, even during a system crash (obviously those are local semantics). However, I see that NetBSD cp opens the file for writing. mv behaves as I expect, and I am pretty sure rsync does too.

comment:3 Changed at 2011-07-16T20:58:15Z by kevan

  • Owner set to kevan
  • Status changed from new to assigned

comment:4 Changed at 2011-07-16T20:58:51Z by kevan

I've been working on this, and I hope to have it fixed by 1.9.

Changed at 2011-07-26T19:24:16Z by kevan

comment:5 follow-up: Changed at 2011-07-26T19:25:03Z by kevan

In the weekly phonecall of two weeks ago (I think), I mentioned that I thought this was a bug in the webapi. That is wrong -- I misunderstood how tahoe cp assigns sources to targets.

I think that the fact that tahoe cp will replace a mutable file with an immutable file is pretty unintuitive, and that's what it does now. Replacing the old contents with new contents also helps tahoe cp preserve use cases that rely on mutable files in directories having consistent caps after updates.

How do we want to behave when tahoe cp's authority is inadequate to replace the contents of the target? I like failure in that case; I don't like the idea of falling back to directory modification from content replacement for mutable files, since they're pretty different behaviors.

I've attached a patch. I'll set 'review-needed' once we've decided what to do when the target mutable file isn't writable and I've written a test for that.

comment:6 Changed at 2011-07-27T16:23:09Z by kevan

  • Keywords design-review-needed added

comment:7 in reply to: ↑ 5 Changed at 2011-07-27T16:29:41Z by davidsarah

  • Owner changed from kevan to davidsarah
  • Status changed from assigned to new

Replying to kevan:

How do we want to behave when tahoe cp's authority is inadequate to replace the contents of the target? I like failure in that case; I don't like the idea of falling back to directory modification from content replacement for mutable files, since they're pretty different behaviors.

+1, that case should be an error.

I've attached a patch. I'll set 'review-needed' once we've decided what to do when the target mutable file isn't writable and I've written a test for that.

I'll review what you have so far.

comment:8 Changed at 2011-07-27T16:30:04Z by davidsarah

  • Status changed from new to assigned

comment:9 Changed at 2011-07-28T23:19:45Z by davidsarah

  • Keywords design-review-needed removed
  • Owner changed from davidsarah to kevan
  • Status changed from assigned to new

Excellent tests! Very comprehensive. Now this just needs the test for when the target file isn't writeable.

comment:10 follow-up: Changed at 2011-07-29T17:10:00Z by kevan

Thanks for the review, davidsarah. I've attached another patch with a new test.

The error message that we test for is a webapi error message. That seems a bit lazy -- it didn't require any code changes, for example -- and it's not the most informative error message I've ever seen. The only way I've thought of to do better is to teach tahoe cp to inspect the error response from the webapi if it gets one so it can append an explanation to the output, and I'm not sure I like the idea of coupling tahoe cp to the exception messages printed by the webapi in that way.

(another initially attractive option is a sanity check performed on tahoe cp's targetmap before copying starts. Based on this, we could either abort the copy operation or print a warning if we found a readonly mutable target. All of that would rely upon JSON gathered from a different HTTP request than the request(s) actually modifying data, though, so any check would be inherently unreliable. Given that, I don't think I'd want to stop the copy operation if a readonly target was found, since I wouldn't want users to get the idea that we could do that reliably and I wouldn't want to annoy users if that fact got in their way, but I'm not as against printing a warning and then continuing, though at that point we might as well just figure out a way to write a better error message after the fact, since that solves the problem in more cases)

Changed at 2011-07-29T17:10:54Z by kevan

added error test

comment:11 Changed at 2011-07-29T19:07:16Z by davidsarah

Typo in comment: "# Store the URIs for later user." -> "later use".

When running the tests I get:

FAIL]: allmydata.test.test_cli.Cp.test_cp_overwrite_readonly_mutable_file

Traceback (most recent call last):
  File "/home/davidsarah/tahoe/trunk/src/allmydata/test/test_cli.py", line 2150, in _check_error_message
    self.failUnlessIn("need write capability to publish", err)
twisted.trial.unittest.FailTest: 'need write capability to publish' not in 'Error during PUT: 400 Bad Request\nPUT to a mutable file: replace or update requested with read-only cap\n'

The 400 response has a reasonable error message -- is it just the test that needs changing, or did you get a different message?

comment:12 in reply to: ↑ 10 Changed at 2011-07-29T19:11:00Z by davidsarah

Replying to kevan:

(another initially attractive option is a sanity check performed on tahoe cp's targetmap before copying starts. Based on this, we could either abort the copy operation or print a warning if we found a readonly mutable target. All of that would rely upon JSON gathered from a different HTTP request than the request(s) actually modifying data, though, so any check would be inherently unreliable.

Yes, that's a general problem with trying to make distributed operations have all-or-nothing semantics. Anyway, even if this could be improved, we don't need to do so for 1.9.

comment:13 follow-up: Changed at 2011-07-29T20:31:28Z by kevan

I got a different error message:

Error during PUT: 500 Internal Server Error
[...]
exceptions.AssertionError: need write capability to publish

Are you testing with my #393 patches applied? It looks like the message you see is something that I added in #393, but which isn't in trunk yet. I like the #393 400 more than the 500 error, though, and we'll have to change the test anyway once #393 is landed. I could change it in my patch now and we could land #1304 once #393 is landed so there isn't a spurious test failure. Alternatively, we could leave the patch as-is, land it now, and then change the test once #393 is landed.

Thanks for spotting the typo; I'll upload a fresh patch.

Changed at 2011-07-29T20:32:20Z by kevan

fixed typo

comment:14 in reply to: ↑ 13 Changed at 2011-07-29T20:38:24Z by davidsarah

Replying to kevan:

I got a different error message:

Error during PUT: 500 Internal Server Error
[...]
exceptions.AssertionError: need write capability to publish

Are you testing with my #393 patches applied?

Yes, that was the problem.

Alternatively, we could leave the patch as-is, land it now, and then change the test once #393 is landed.

+1.

comment:15 Changed at 2011-07-29T20:38:46Z by davidsarah

  • Keywords reviewed added

comment:16 Changed at 2011-07-30T03:19:36Z by davidsarah

  • Keywords reviewed removed
  • Resolution set to fixed
  • Status changed from new to closed

Fixed in 448278e807e489e4.

comment:17 Changed at 2011-07-30T04:09:09Z by davidsarah

  • Resolution fixed deleted
  • Status changed from closed to reopened

This seems to have caused a test failure on Windows: http://tahoe-lafs.org/buildbot/builders/FreeStorm%20WinXP-x86%20py2.6/builds/539/steps/test/logs/stdio

It seems that a command line URI argument is being passed as Unicode when it should be a str. I can't immediately see why it only fails on Windows.

comment:18 Changed at 2011-07-30T04:14:35Z by davidsarah

Oh, I see. simplejson.loads doesn't consistently return the same type, it returns either unicode or str depending on its version and the phase of the moon. For example,

self.rw_uri = data["rw_uri"]

should be

self.rw_uri = to_str(data["rw_uri"])

where to_str is defined in src/allmydata/util/encodingutil.py (similarly for all the URIs read from JSON).

comment:19 Changed at 2011-07-30T04:24:04Z by david-sarah@…

In 16fd14a78ad86536:

test_cli.py: use to_str on fields loaded using simplejson.loads in new tests. refs #1304

comment:20 Changed at 2011-07-30T04:39:38Z by davidsarah

  • Resolution set to fixed
  • Status changed from reopened to closed
Note: See TracTickets for help on using tickets.