#2885 closed defect (fixed)

Magic-folder status API often fails to display queued upload operations

Reported by: cypher Owned by:
Priority: normal Milestone: undecided
Component: unknown Version: 1.12.1
Keywords: magic-folder reliability transparency Cc:
Launchpad Bug:

Description

While working on an application that uses the magic-folder status API (in this case, a monitor module for Gridsync that displays and updates the total progress and time remaining of an in-progress magic-folder sync operation), I've noticed that the status API will often fail to display entries for queued operations and/or drop previously-displayed entries from its output.

To demonstrate what I mean by this, I have a magic-folder already configured and a simple python script that pprint()s the JSON output of the magic-folder status API..

To start, I create five 100M junk files inside the local magic-folder directory (by dding /dev/urandom into file_1, file_2, and so on), simulating the behavior of a user that might dump a handful of large-ish files into a magic-folder.

Later, I send an authenticated POST request to magic-folder status API and get back the following output:

[{'kind': 'upload',
  'path': 'file_2',
  'percent_done': 0,
  'queued_at': 1498095903.754096,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_3',
  'percent_done': 0,
  'queued_at': 1498095913.339961,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_4',
  'percent_done': 0,
  'queued_at': 1498095922.907665,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_5',
  'percent_done': 0,
  'queued_at': 1498095932.553877,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_1',
  'percent_done': 54.37499999999999,
  'queued_at': 1498095893.985629,
  'started_at': 1498095894.420489,
  'status': 'started'}]

This is exactly the output that I'd expect to see: file_1 is just over half-way uploaded and file_2 through file_5 are sitting in the queued state (i.e., pending upload). If I know the filesize of each file (and, in this case, I do!) I can easily calculate and display the total progress of the overall sync operation (10.9%!) and make rough estimates as to how long the process as a whole will take based on the speed/duration of the upload as I continue to monitor it. Perfect!

Moments later, however, I notice something strange: After file_1 finishes and file_2 begins uploading, file_3 through file_5 disappear from the status output altogether; I see only this:

[{'kind': 'upload',
  'path': 'file_2',
  'percent_done': 2.875,
  'queued_at': 1498095903.754096,
  'started_at': 1498096006.468258,
  'status': 'started'},
 {'kind': 'upload',
  'path': 'file_1',
  'percent_done': 100.0,
  'queued_at': 1498095893.985629,
  'started_at': 1498095894.420489,
  'status': 'success',
  'success_at': 1498096006.467593}]

If I go by what the status API tells me alone, I might think that those files have disappeared (and if I hadn't seen the previous output, I wouldn't even know that file_3 through file_5 exist!), however, a quick look into my local magic-folder directory reveals that they're still sitting there:

dev ~ % ls -al Magic-Folder
total 501M
drwxr-xr-x  2 user user 4.0K Jun 21 21:45 .
drwxr-xr-x 64 user user 4.0K Jun 21 22:39 ..
-rw-r--r--  1 user user 100M Jun 21 21:44 file_1
-rw-r--r--  1 user user 100M Jun 21 21:45 file_2
-rw-r--r--  1 user user 100M Jun 21 21:45 file_3
-rw-r--r--  1 user user 100M Jun 21 21:45 file_4
-rw-r--r--  1 user user 100M Jun 21 21:45 file_5

Shortly after, I check again and see that file_3 is now uploading. file_4 and file_5, however, are still absent:

[{'kind': 'upload',
  'path': 'file_3',
  'percent_done': 16.625,
  'queued_at': 1498095913.339961,
  'started_at': 1498096121.977303,
  'status': 'started'},
 {'kind': 'upload',
  'path': 'file_2',
  'percent_done': 100.0,
  'queued_at': 1498095903.754096,
  'started_at': 1498096006.468258,
  'status': 'success',
  'success_at': 1498096121.977229},
 {'kind': 'upload',
  'path': 'file_1',
  'percent_done': 100.0,
  'queued_at': 1498095893.985629,
  'started_at': 1498095894.420489,
  'status': 'success',
  'success_at': 1498096006.467593}]

I wait a few moments again and then I see this:

{'kind': 'upload',
  'path': 'file_2',
  'percent_done': 0,
  'queued_at': 1498096313.364319,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_1',
  'percent_done': 0,
  'queued_at': 1498096313.364558,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_4',
  'percent_done': 0,
  'queued_at': 1498096313.364875,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_3',
  'percent_done': 0,
  'queued_at': 1498096313.365039,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_4',
  'percent_done': 88.75,
  'queued_at': 1498095922.907665,
  'started_at': 1498096236.681625,
  'status': 'started'},
 {'kind': 'upload',
  'path': 'file_3',
  'percent_done': 100.0,
  'queued_at': 1498095913.339961,
  'started_at': 1498096121.977303,
  'status': 'success',
  'success_at': 1498096236.681511},
 {'kind': 'upload',
  'path': 'file_2',
  'percent_done': 100.0,
  'queued_at': 1498095903.754096,
  'started_at': 1498096006.468258,
  'status': 'success',
  'success_at': 1498096121.977229},
 {'kind': 'upload',
  'path': 'file_1',
  'percent_done': 100.0,
  'queued_at': 1498095893.985629,
  'started_at': 1498095894.420489,
  'status': 'success',
  'success_at': 1498096006.467593}]

For some reason, file_1 through file_4 now have two entries each. I haven't modified these files since their creation so I'm not sure why magic-folder suddenly wants to upload them again (if that's what it's planning on doing) but I assume, for now, that they'll go away later -- maybe after file_5 re-appears and completes its upload..

Checking later, I see that file_5 is now back -- and the total operation progress can be calculated with confidence at 85.1%:

[{'kind': 'upload',
  'path': 'file_2',
  'percent_done': 0,
  'queued_at': 1498096313.364319,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_1',
  'percent_done': 0,
  'queued_at': 1498096313.364558,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_4',
  'percent_done': 0,
  'queued_at': 1498096313.364875,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_3',
  'percent_done': 0,
  'queued_at': 1498096313.365039,
  'status': 'queued'},
 {'kind': 'upload',
  'path': 'file_5',
  'percent_done': 25.5,
  'queued_at': 1498095932.553877,
  'started_at': 1498096351.210107,
  'status': 'started'},
 {'kind': 'upload',
  'path': 'file_4',
  'percent_done': 100.0,
  'queued_at': 1498095922.907665,
  'started_at': 1498096236.681625,
  'status': 'success',
  'success_at': 1498096351.209893},
 {'kind': 'upload',
  'path': 'file_3',
  'percent_done': 100.0,
  'queued_at': 1498095913.339961,
  'started_at': 1498096121.977303,
  'status': 'success',
  'success_at': 1498096236.681511},
 {'kind': 'upload',
  'path': 'file_2',
  'percent_done': 100.0,
  'queued_at': 1498095903.754096,
  'started_at': 1498096006.468258,
  'status': 'success',
  'success_at': 1498096121.977229},
 {'kind': 'upload',
  'path': 'file_1',
  'percent_done': 100.0,
  'queued_at': 1498095893.985629,
  'started_at': 1498095894.420489,
  'status': 'success',
  'success_at': 1498096006.467593}]

For whatever it's worth, I see that the queued_at time for my file_5 upload is earlier than that of the second file_1 upload and yet, despite being temporally prior, it failed to remain present when the second file_1 operation appeared. More interestingly, the queued_at value for file_5 is the same as what it was when I originally saw it (i.e., just before it disappeared from the JSON output) leading me to believe that magic-folder "knew" about the file_5 operation all along but, for whatever reason, stopped displaying it in the JSON output along the way.

With all that being said, given this behavior, the magic-folder status API seemingly cannot be relied upon to provide accurate information regarding future/pending upload operations (while my goal of providing a single status indicator for a set of upload/download operations seemingly can't be achieved without resorting to extremely wasteful measures); without catching that brief early moment when all of the queued files are faithfully present (i.e., before they disappear from the JSON output), there appears to be no simple or direct way of knowing which files magic-folder is planning to upload until that upload actually begins.

Ideally, this would work in a more reliable fashion: when magic-folder's filesystem monitor detects a local file change, the subsequent upload operation should be added to whatever queue the status API gathers its information from and remain there indefinitely. Having entries disappear and reappear requires that any applications relying on this API perform additional and ongoing checks to assess the accuracy of the information being provided (for example, having to check and see whether file_5 actually disappeared from the disk).

Change History (5)

comment:1 Changed at 2017-06-22T14:23:51Z by cypher

  • Keywords reliability transparency added

comment:2 Changed at 2018-01-09T19:36:15Z by meejah

Hmm, this might be some bug in the actual magic-folder queueing logic? IIRC, the status API is "just" outputting everything from the queue.

comment:4 Changed at 2018-03-20T22:48:21Z by meejah

  • Resolution set to fixed
  • Status changed from new to closed

comment:5 Changed at 2018-03-20T22:48:29Z by meejah

this is now merged on master

Note: See TracTickets for help on using tickets.