#1865 new defect

tahoe backup swallows exceptions too broadly, which leads to assertion failures.

Reported by: nejucomo
Owned by: davidsarah
Priority: normal
Milestone: undecided
Component: unknown
Version: 1.9.2
Keywords: error tahoe-backup
Cc:
Launchpad Bug:

Description

Symptom:

I had a long-running tahoe backup command, which I suspended. Then I started another tahoe backup and saw this traceback:

Traceback (most recent call last):
  File "/home/n/virtualenvs/default/bin/tahoe", line 8, in <module>
    load_entry_point('allmydata-tahoe==1.9.2', 'console_scripts', 'tahoe')()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 113, in run
    rc = runner(sys.argv[1:], install_node_control=install_node_control)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 99, in runner
    rc = cli.dispatch[command](so)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/cli.py", line 569, in backup
    rc = tahoe_backup.backup(options)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 325, in backup
    return bu.run()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 118, in run
    new_backup_dircap = self.process(options.from_dir)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 188, in process
    childcap = self.process(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 194, in process
    childcap, metadata = self.upload(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 313, in upload
    bdb_results.did_upload(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 134, in did_upload
    self.mtime, self.ctime, self.size)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 276, in did_upload_file
    fileid = self.get_or_allocate_fileid_for_cap(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 270, in get_or_allocate_fileid_for_cap
    assert foundrow
AssertionError

Diagnosis:

After examining the code with much help from Zooko, I tweaked the installed source like this:

--- ./allmydata-tahoe-1.9.2/src/allmydata/scripts/backupdb.py   2012-05-14 03:07:24.000000000 -0700
+++ /home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py       2012-11-17 22:15:15.000000000 -0800
@@ -261,7 +261,8 @@
         c = self.cursor
         try:
             c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
-        except (self.sqlite_module.IntegrityError, self.sqlite_module.OperationalError):
+        except (self.sqlite_module.IntegrityError, self.sqlite_module.OperationalError), e:
+            print 'HACKY DEBUG HACK: %r %r' % (e, e.args)
             # sqlite3 on sid gives IntegrityError
             # pysqlite2 on dapper gives OperationalError
             pass

I reran the backup command and saw this:

HACKY DEBUG HACK: OperationalError('attempt to write a readonly database',) ('attempt to write a readonly database',)
Traceback (most recent call last):
  File "/home/n/virtualenvs/default/bin/tahoe", line 8, in <module>
    load_entry_point('allmydata-tahoe==1.9.2', 'console_scripts', 'tahoe')()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 113, in run
    rc = runner(sys.argv[1:], install_node_control=install_node_control)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 99, in runner
    rc = cli.dispatch[command](so)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/cli.py", line 569, in backup
    rc = tahoe_backup.backup(options)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 325, in backup
    return bu.run()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 118, in run
    new_backup_dircap = self.process(options.from_dir)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 188, in process
    childcap = self.process(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 194, in process
    childcap, metadata = self.upload(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 313, in upload
    bdb_results.did_upload(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 134, in did_upload
    self.mtime, self.ctime, self.size)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 277, in did_upload_file
    fileid = self.get_or_allocate_fileid_for_cap(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 271, in get_or_allocate_fileid_for_cap
    assert foundrow
AssertionError

backupdb's get_or_allocate_fileid_for_cap() swallows exceptions with the intent of tolerating duplicate-key collisions, i.e. the semantics of "insert, but if it's already there, that's OK". However, the except clause actually swallows a much more general set of error conditions. In this case, a lock held by the suspended backup process caused the new process's INSERT to fail with an OperationalError ("attempt to write a readonly database"); that failure was silently swallowed, so the subsequent SELECT found no row and the assertion failed.
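The problem can be reproduced in isolation: the same except clause catches both the intended duplicate-key case and an unrelated read-only-database failure, and the caller cannot tell them apart. A minimal standalone sketch (the table and filecap values are illustrative, not the actual backupdb schema or data):

```python
import os
import sqlite3
import tempfile

# Create a database with a UNIQUE column, then reopen it read-only.
path = os.path.join(tempfile.mkdtemp(), "backup.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE caps (fileid INTEGER PRIMARY KEY, filecap TEXT UNIQUE)")
db.execute("INSERT INTO caps (filecap) VALUES ('URI:CHK:example')")
db.commit()
db.close()

w = sqlite3.connect(path)                          # writable connection
ro = sqlite3.connect("file:%s?mode=ro" % path, uri=True)  # read-only connection

def swallowed(conn, filecap):
    """Mimic the original over-broad handler: both a duplicate key and a
    read-only database vanish into the same except clause."""
    try:
        conn.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
    except (sqlite3.IntegrityError, sqlite3.OperationalError) as e:
        return type(e).__name__  # swallowed; caller can't tell why
    return None

# Duplicate key on the writable connection: the intended, benign case.
print(swallowed(w, "URI:CHK:example"))   # IntegrityError

# Any write on the read-only connection: a real failure, eaten just the same.
print(swallowed(ro, "URI:CHK:new"))      # OperationalError
```

After the second call the row was never inserted, so the SELECT that follows in the real code finds nothing and `assert foundrow` fires, exactly as in the tracebacks above.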

Resolution:

Zooko suggests this fix:

  1. Ensure the SQL commands are in a transaction (see caveat below)
  2. Query for the cap to get the fileid
  3. If it is not found:
    1. Insert it, without swallowing exceptions
    2. Query for the cap again to get the fileid
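The steps above can be sketched standalone against an in-memory database (function and table names are illustrative; the real code is in backupdb.py):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE caps (fileid INTEGER PRIMARY KEY, filecap TEXT UNIQUE)")

def get_or_allocate_fileid_for_cap(c, filecap):
    # 1. Query first.
    c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
    row = c.fetchone()
    if row is None:
        # 2. Not found: insert WITHOUT swallowing exceptions. A genuine
        #    failure (locked or read-only db) now propagates to the caller
        #    instead of turning into a bare AssertionError later.
        c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
        # 3. Re-query to pick up the fileid the insert allocated.
        c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
        row = c.fetchone()
    assert row
    return row[0]

cur = conn.cursor()
a = get_or_allocate_fileid_for_cap(cur, "URI:CHK:x")
b = get_or_allocate_fileid_for_cap(cur, "URI:CHK:x")
print(a == b)  # True: the second call finds the existing row
```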

Caveat:

The transaction scope of the db backend is important. Zooko did some quick searching and concluded that sqlite3's locking is table-scoped, which should make this safe; to support other databases, more research is needed to determine whether the approach remains feasible.
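For sqlite specifically, the query-then-insert sequence can be made atomic with respect to other writers by opening an immediate (write) transaction up front: a second writer then blocks until the first commits, or times out, rather than interleaving between the SELECT and the INSERT. A small sketch of that locking behavior (file name and timeouts are illustrative):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "locking.db")
# isolation_level=None gives us explicit transaction control;
# a short timeout makes the lock conflict visible quickly.
a = sqlite3.connect(path, timeout=0.1, isolation_level=None)
b = sqlite3.connect(path, timeout=0.1, isolation_level=None)
a.execute("CREATE TABLE caps (fileid INTEGER PRIMARY KEY, filecap TEXT UNIQUE)")

# Connection a takes a write lock for the whole check-then-insert.
a.execute("BEGIN IMMEDIATE")
a.execute("INSERT INTO caps (filecap) VALUES ('URI:CHK:x')")

# Connection b cannot start its own write transaction until a commits.
try:
    b.execute("BEGIN IMMEDIATE")
    locked = False
except sqlite3.OperationalError:  # "database is locked"
    locked = True
print(locked)  # True

a.commit()
b.execute("BEGIN IMMEDIATE")  # now succeeds
b.rollback()
```

Databases with finer-grained (e.g. row-level) locking would not get this guarantee automatically, which is why a stricter isolation level would be needed there, as comment:1 below notes.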

Change History (1)

comment:1 Changed at 2012-11-18T07:06:26Z by zooko

Here's a patch which I think will do the right thing while not catching exceptions that it shouldn't catch. This is safe in sqlite, because sqlite always locks whole tables. For databases with finer-grained transactions, it would be necessary to use isolation level "serializable" or stricter for this to be safe.

diff --git a/src/allmydata/scripts/backupdb.py b/src/allmydata/scripts/backupdb.py
index 75ee0d9..8eed692 100644
--- a/src/allmydata/scripts/backupdb.py
+++ b/src/allmydata/scripts/backupdb.py
@@ -247,14 +247,12 @@ class BackupDB_v2:
         # sqlite has "INSERT ON CONFLICT REPLACE", but not mysql
         # So we use INSERT, ignore any error, then a SELECT
         c = self.cursor
-        try:
-            c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
-        except (self.sqlite_module.IntegrityError, self.sqlite_module.OperationalError):
-            # sqlite3 on sid gives IntegrityError
-            # pysqlite2 (which we don't use, so maybe no longer relevant) on dapper gives OperationalError
-            pass
         c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
         foundrow = c.fetchone()
+        if foundrow is None:
+            c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
+            c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
+            foundrow = c.fetchone()
         assert foundrow
         fileid = foundrow[0]
         return fileid