Opened at 2012-11-18T06:42:54Z
Last modified at 2012-11-18T07:06:26Z
#1865 new defect
backup has over-general exception swallowing that leads to assertion failures.
| Reported by: | nejucomo | Owned by: | davidsarah |
|---|---|---|---|
| Priority: | normal | Milestone: | undecided |
| Component: | unknown | Version: | 1.9.2 |
| Keywords: | error tahoe-backup | Cc: | |
| Launchpad Bug: | | | |
Description
Symptom
I had a long-running `tahoe backup` command which I suspended. Then I started another `tahoe backup` and saw this traceback:
```
Traceback (most recent call last):
  File "/home/n/virtualenvs/default/bin/tahoe", line 8, in <module>
    load_entry_point('allmydata-tahoe==1.9.2', 'console_scripts', 'tahoe')()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 113, in run
    rc = runner(sys.argv[1:], install_node_control=install_node_control)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 99, in runner
    rc = cli.dispatch[command](so)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/cli.py", line 569, in backup
    rc = tahoe_backup.backup(options)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 325, in backup
    return bu.run()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 118, in run
    new_backup_dircap = self.process(options.from_dir)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 188, in process
    childcap = self.process(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 194, in process
    childcap, metadata = self.upload(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 313, in upload
    bdb_results.did_upload(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 134, in did_upload
    self.mtime, self.ctime, self.size)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 276, in did_upload_file
    fileid = self.get_or_allocate_fileid_for_cap(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 270, in get_or_allocate_fileid_for_cap
    assert foundrow
AssertionError
```
Diagnosis:
After examining the code with much help from Zooko, I tweaked the installed source like this:
```diff
--- ./allmydata-tahoe-1.9.2/src/allmydata/scripts/backupdb.py	2012-05-14 03:07:24.000000000 -0700
+++ /home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py	2012-11-17 22:15:15.000000000 -0800
@@ -261,7 +261,8 @@
         c = self.cursor
         try:
             c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
-        except (self.sqlite_module.IntegrityError, self.sqlite_module.OperationalError):
+        except (self.sqlite_module.IntegrityError, self.sqlite_module.OperationalError), e:
+            print 'HACKY DEBUG HACK: %r %r' % (e, e.args)
             # sqlite3 on sid gives IntegrityError
             # pysqlite2 on dapper gives OperationalError
             pass
```
I reran the backup command to see this:
```
HACKY DEBUG HACK: OperationalError('attempt to write a readonly database',) ('attempt to write a readonly database',)
Traceback (most recent call last):
  File "/home/n/virtualenvs/default/bin/tahoe", line 8, in <module>
    load_entry_point('allmydata-tahoe==1.9.2', 'console_scripts', 'tahoe')()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 113, in run
    rc = runner(sys.argv[1:], install_node_control=install_node_control)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/runner.py", line 99, in runner
    rc = cli.dispatch[command](so)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/cli.py", line 569, in backup
    rc = tahoe_backup.backup(options)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 325, in backup
    return bu.run()
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 118, in run
    new_backup_dircap = self.process(options.from_dir)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 188, in process
    childcap = self.process(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 194, in process
    childcap, metadata = self.upload(childpath)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/tahoe_backup.py", line 313, in upload
    bdb_results.did_upload(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 134, in did_upload
    self.mtime, self.ctime, self.size)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 277, in did_upload_file
    fileid = self.get_or_allocate_fileid_for_cap(filecap)
  File "/home/n/virtualenvs/default/lib/python2.7/site-packages/allmydata/scripts/backupdb.py", line 271, in get_or_allocate_fileid_for_cap
    assert foundrow
AssertionError
```
backupdb's get_or_allocate_fileid_for_cap() swallows exceptions with the intent of tolerating key collisions, i.e. "insert, but if it's already there, that's OK". However, it actually swallows a much more general set of exceptional conditions. In this case, a lock is held by the suspended backup process, so the new process's INSERT fails; the error is silently discarded, the row is never inserted, and the subsequent SELECT finds nothing, tripping the assertion.
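A minimal, self-contained reproduction of the masking behavior (the schema and cap string are illustrative, and opening the database read-only stands in for the lock held by the suspended process, which is just an easier way to provoke an OperationalError on write):

```python
import os
import sqlite3
import tempfile

# Create a database with a caps table like backupdb's, then reopen it
# read-only so that any INSERT raises sqlite3.OperationalError.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "backupdb.sqlite")
db = sqlite3.connect(path)
db.execute("CREATE TABLE caps (fileid INTEGER PRIMARY KEY AUTOINCREMENT,"
           " filecap VARCHAR(256) UNIQUE)")
db.commit()
db.close()

ro = sqlite3.connect("file:%s?mode=ro" % path, uri=True)
c = ro.cursor()
try:
    c.execute("INSERT INTO caps (filecap) VALUES (?)", ("URI:CHK:example",))
except (sqlite3.IntegrityError, sqlite3.OperationalError) as e:
    # The broad except, meant only for "row already exists", also swallows
    # "attempt to write a readonly database" -- so the INSERT never happened.
    print(repr(e))

c.execute("SELECT fileid FROM caps WHERE filecap=?", ("URI:CHK:example",))
foundrow = c.fetchone()
print(foundrow)  # None: exactly the condition that trips `assert foundrow`
```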
Resolution:
Zooko suggests this fix:
- Ensure the SQL commands are in a transaction (see caveat below).
- Query for the cap to get the fileid.
- If it is not found:
  - insert it without swallowing exceptions;
  - query for the cap again to get the fileid.
Caveat:
The transaction scope of the database backend is important. Zooko did some quick searching and determined that sqlite3 locking is table-scoped, which should be safe here; but to support other databases, more research is needed to determine whether this approach is feasible.

Here's a patch which I think will do the right thing while not catching exceptions that it shouldn't catch. This is safe in sqlite, because sqlite always locks whole tables. For databases with finer-grained transactions, it would be necessary to use isolation level "serializable" or stricter for this to be safe.
```diff
diff --git a/src/allmydata/scripts/backupdb.py b/src/allmydata/scripts/backupdb.py
index 75ee0d9..8eed692 100644
--- a/src/allmydata/scripts/backupdb.py
+++ b/src/allmydata/scripts/backupdb.py
@@ -247,14 +247,12 @@ class BackupDB_v2:
         # sqlite has "INSERT ON CONFLICT REPLACE", but not mysql
         # So we use INSERT, ignore any error, then a SELECT
         c = self.cursor
-        try:
-            c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
-        except (self.sqlite_module.IntegrityError, self.sqlite_module.OperationalError):
-            # sqlite3 on sid gives IntegrityError
-            # pysqlite2 (which we don't use, so maybe no longer relevant) on dapper gives OperationalError
-            pass
         c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
         foundrow = c.fetchone()
+        if foundrow is None:
+            c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
+            c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
+            foundrow = c.fetchone()
         assert foundrow
         fileid = foundrow[0]
         return fileid
```
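For reference, here is a runnable sketch of the patched select-then-insert logic against an in-memory sqlite3 database. The SQL statements mirror the patch; the standalone function, schema, and cap string are my own illustrative harness, not the real backupdb module:

```python
import sqlite3

def get_or_allocate_fileid_for_cap(c, filecap):
    # Patched logic: SELECT first; only INSERT when the cap is missing,
    # and let any real database error propagate instead of swallowing it.
    c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
    foundrow = c.fetchone()
    if foundrow is None:
        c.execute("INSERT INTO caps (filecap) VALUES (?)", (filecap,))
        c.execute("SELECT fileid FROM caps WHERE filecap=?", (filecap,))
        foundrow = c.fetchone()
    assert foundrow
    return foundrow[0]

db = sqlite3.connect(":memory:")
c = db.cursor()
c.execute("CREATE TABLE caps (fileid INTEGER PRIMARY KEY AUTOINCREMENT,"
          " filecap VARCHAR(256) UNIQUE)")
a = get_or_allocate_fileid_for_cap(c, "URI:CHK:example")
b = get_or_allocate_fileid_for_cap(c, "URI:CHK:example")  # finds existing row
print(a == b)  # True: the same fileid is returned both times
```

A genuine write failure (such as the "readonly database" error above) now surfaces as an exception at the INSERT rather than as a baffling AssertionError two lines later.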