[tahoe-lafs-trac-stream] [Tahoe-LAFS] #2885: Magic-folder status API often fails to display queued upload operations
Tahoe-LAFS
trac at tahoe-lafs.org
Thu Jun 22 05:40:20 UTC 2017
#2885: Magic-folder status API often fails to display queued upload operations
--------------------------+---------------------------
Reporter: cypher | Owner:
Type: defect | Status: new
Priority: normal | Milestone: undecided
Component: unknown | Version: 1.12.1
Keywords: magic-folder | Launchpad Bug:
--------------------------+---------------------------
While working on an application that uses the magic-folder status API (in
this case, a monitor module for [http://gridsync.io Gridsync] that
displays and updates the total progress and time remaining of an in-
progress magic-folder sync operation), I've noticed that the status API
will often fail to display entries for queued operations and/or drop
previously-displayed entries from its output.
To demonstrate what I mean by this, I have a magic-folder already
configured and a simple python script that `pprint()`s the JSON output of
the magic-folder status API..
To start, I create five 100M junk files inside the local magic-folder
directory (by `dd`ing `/dev/urandom` into `file_1`, `file_2`, and so on),
simulating the behavior of a user that might dump a handful of large-ish
files into a magic-folder.
Later, I send an authenticated POST request to magic-folder status API and
get back the following output:
{{{
[{'kind': 'upload',
'path': 'file_2',
'percent_done': 0,
'queued_at': 1498095903.754096,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_3',
'percent_done': 0,
'queued_at': 1498095913.339961,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_4',
'percent_done': 0,
'queued_at': 1498095922.907665,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_5',
'percent_done': 0,
'queued_at': 1498095932.553877,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_1',
'percent_done': 54.37499999999999,
'queued_at': 1498095893.985629,
'started_at': 1498095894.420489,
'status': 'started'}]
}}}
This is exactly the output that I'd expect to see: `file_1` is just over
half-way uploaded and `file_2` through `file_5` are sitting in the
`queued` state (i.e., pending upload). If I know the filesize of each file
(and, in this case, I do!) I can easily calculate and display the total
progress of the overall sync operation (10.9%!) and make rough estimates
as to how long the process as a whole will take based on the
speed/duration of the upload as I continue to monitor it. Perfect!
Moments later, however, I notice something strange: After `file_1`
finishes and `file_2` begins uploading, `file_3` through `file_5`
disappear from the status output altogether; I see only this:
{{{
[{'kind': 'upload',
'path': 'file_2',
'percent_done': 2.875,
'queued_at': 1498095903.754096,
'started_at': 1498096006.468258,
'status': 'started'},
{'kind': 'upload',
'path': 'file_1',
'percent_done': 100.0,
'queued_at': 1498095893.985629,
'started_at': 1498095894.420489,
'status': 'success',
'success_at': 1498096006.467593}]
}}}
If I go by what the status API tells me alone, I might think that those
files have disappeared (and if I hadn't seen the previous output, I
wouldn't even know that `file_3` through `file_5` exist!), however, a
quick look into my local magic-folder directory reveals that they're still
sitting there:
{{{
dev ~ % ls -al Magic-Folder
total 501M
drwxr-xr-x 2 user user 4.0K Jun 21 21:45 .
drwxr-xr-x 64 user user 4.0K Jun 21 22:39 ..
-rw-r--r-- 1 user user 100M Jun 21 21:44 file_1
-rw-r--r-- 1 user user 100M Jun 21 21:45 file_2
-rw-r--r-- 1 user user 100M Jun 21 21:45 file_3
-rw-r--r-- 1 user user 100M Jun 21 21:45 file_4
-rw-r--r-- 1 user user 100M Jun 21 21:45 file_5
}}}
Shortly after, I check again and see that `file_3` is now uploading.
`file_4` and `file_5`, however, are still absent:
{{{
[{'kind': 'upload',
'path': 'file_3',
'percent_done': 16.625,
'queued_at': 1498095913.339961,
'started_at': 1498096121.977303,
'status': 'started'},
{'kind': 'upload',
'path': 'file_2',
'percent_done': 100.0,
'queued_at': 1498095903.754096,
'started_at': 1498096006.468258,
'status': 'success',
'success_at': 1498096121.977229},
{'kind': 'upload',
'path': 'file_1',
'percent_done': 100.0,
'queued_at': 1498095893.985629,
'started_at': 1498095894.420489,
'status': 'success',
'success_at': 1498096006.467593}]
}}}
I wait a few moments again and then I see this:
{{{
{'kind': 'upload',
'path': 'file_2',
'percent_done': 0,
'queued_at': 1498096313.364319,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_1',
'percent_done': 0,
'queued_at': 1498096313.364558,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_4',
'percent_done': 0,
'queued_at': 1498096313.364875,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_3',
'percent_done': 0,
'queued_at': 1498096313.365039,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_4',
'percent_done': 88.75,
'queued_at': 1498095922.907665,
'started_at': 1498096236.681625,
'status': 'started'},
{'kind': 'upload',
'path': 'file_3',
'percent_done': 100.0,
'queued_at': 1498095913.339961,
'started_at': 1498096121.977303,
'status': 'success',
'success_at': 1498096236.681511},
{'kind': 'upload',
'path': 'file_2',
'percent_done': 100.0,
'queued_at': 1498095903.754096,
'started_at': 1498096006.468258,
'status': 'success',
'success_at': 1498096121.977229},
{'kind': 'upload',
'path': 'file_1',
'percent_done': 100.0,
'queued_at': 1498095893.985629,
'started_at': 1498095894.420489,
'status': 'success',
'success_at': 1498096006.467593}]
}}}
For some reason, `file_1` through `file_4` now have ''two'' entries each.
I haven't modified these files since their creation so I'm not sure why
magic-folder suddenly wants to upload them again (''if'' that's what it's
planning on doing) but I assume, for now, that they'll go away later --
maybe after `file_5` re-appears and completes its upload..
Checking later, I see that `file_5` is now back -- and the total operation
progress can be calculated with confidence at 85.1%:
{{{
[{'kind': 'upload',
'path': 'file_2',
'percent_done': 0,
'queued_at': 1498096313.364319,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_1',
'percent_done': 0,
'queued_at': 1498096313.364558,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_4',
'percent_done': 0,
'queued_at': 1498096313.364875,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_3',
'percent_done': 0,
'queued_at': 1498096313.365039,
'status': 'queued'},
{'kind': 'upload',
'path': 'file_5',
'percent_done': 25.5,
'queued_at': 1498095932.553877,
'started_at': 1498096351.210107,
'status': 'started'},
{'kind': 'upload',
'path': 'file_4',
'percent_done': 100.0,
'queued_at': 1498095922.907665,
'started_at': 1498096236.681625,
'status': 'success',
'success_at': 1498096351.209893},
{'kind': 'upload',
'path': 'file_3',
'percent_done': 100.0,
'queued_at': 1498095913.339961,
'started_at': 1498096121.977303,
'status': 'success',
'success_at': 1498096236.681511},
{'kind': 'upload',
'path': 'file_2',
'percent_done': 100.0,
'queued_at': 1498095903.754096,
'started_at': 1498096006.468258,
'status': 'success',
'success_at': 1498096121.977229},
{'kind': 'upload',
'path': 'file_1',
'percent_done': 100.0,
'queued_at': 1498095893.985629,
'started_at': 1498095894.420489,
'status': 'success',
'success_at': 1498096006.467593}]
}}}
For whatever it's worth, I see that the `queued_at` time for my `file_5`
upload is earlier than that of the second `file_1` upload and yet, despite
being temporally prior, it failed to remain present when the second
`file_1` operation appeared. More interestingly, the `queued_at` value for
`file_5` is the same as what it was when I originally saw it (i.e., just
before it disappeared from the JSON output) leading me to believe that
magic-folder "knew" about the `file_5` operation all along but, for
whatever reason, stopped displaying it in the JSON output along the way.
With all that being said, given this behavior, the magic-folder status API
seemingly cannot be relied upon to provide accurate information regarding
future/pending upload operations (while my goal of providing a single
status indicator for a set of upload/download operations seemingly can't
be achieved without resorting to extremely wasteful measures); without
catching that brief early moment when all of the queued files are
faithfully present (i.e., before they disappear from the JSON output),
there appears to be no simple or direct way of knowing which files magic-
folder is planning to upload until that upload actually begins.
Ideally, this would work in a more reliable fashion: when magic-folder's
filesystem monitor detects a local file change, the subsequent upload
operation should be added to whatever queue the status API gathers its
information from and remain there indefinitely. Having entries disappear
and reappear requires that any applications relying on this API perform
additional and ongoing checks to assess the accuracy of the information
being provided (for example, having to check and see whether `file_5`
''actually'' disappeared from the disk).
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2885>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage
More information about the tahoe-lafs-trac-stream
mailing list