= Known Issues =

Below is a list of known issues in recent releases of Tahoe, and how to manage
them.


== issues in Tahoe v1.1.0, released 2008-06-10 ==

=== issue 1: server out of space when writing mutable file ===

If a v1.0 or v1.1.0 storage server runs out of disk space then its attempts to
write data to the local filesystem will fail. For immutable files, this will
not lead to any problem (the attempt to upload that share to that server will
fail, the partially uploaded share will be deleted from the storage server's
"incoming shares" directory, and the client will move on to using another
storage server instead).

If the write was an attempt to modify an existing mutable file, however, a
problem will result: when the attempt to write the new share fails due to
insufficient disk space, then it will be aborted and the old share will be left
in place. If enough such old shares are left, then a subsequent read may get
those old shares and see the file in its earlier state, which is a "rollback"
failure. With the default parameters (3-of-10), six old shares will be enough
to potentially lead to a rollback failure.

==== how to manage it ====

Make sure your Tahoe storage servers don't run out of disk space. This means
refusing storage requests before the disk fills up. There are a couple of ways
to do that with v1.1.

First, there is a configuration option named "sizelimit" which will cause the
storage server to do a "du" style recursive examination of its directories at
startup, and then if the sum of the size of files found therein is greater than
the "sizelimit" number, it will reject requests by clients to write new
immutable shares.

However, that can take a long time (something on the order of a minute of
examination of the filesystem for each 10 GB of data stored in the Tahoe
server), and the Tahoe server will be unavailable to clients during that time.
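
For example, here is a minimal sketch of enabling "sizelimit". It assumes the
v1.1-era convention of one small configuration file per option in the node's
base directory; the file name, accepted syntax, and the {{{~/.tahoe}}} path are
assumptions, so confirm them against the {{{docs/configuration.txt}}} that
ships with your release:

{{{
# In the storage server's base directory (assumed here to be ~/.tahoe),
# write the desired limit into a file named "sizelimit", then restart the node.
echo "10GB" > ~/.tahoe/sizelimit
}}}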

Another option is to set the "readonly_storage" configuration option on the
storage server before startup. This will cause the storage server to reject
all requests to upload new immutable shares.
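
Similarly, a sketch of enabling "readonly_storage", under the same assumption
that v1.1 reads one small file per option from the node's base directory
(again, confirm the exact mechanism in {{{docs/configuration.txt}}} for your
release):

{{{
# Creating an empty file named "readonly_storage" in the node's base directory
# (assumed here to be ~/.tahoe) and then restarting the node causes it to
# reject requests to upload new immutable shares.
touch ~/.tahoe/readonly_storage
}}}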

Note that neither of these configurations affects mutable shares: even if
sizelimit is configured and the storage server currently has more space used
than allowed, or even if readonly_storage is configured, servers will continue
to accept new mutable shares and will continue to accept requests to overwrite
existing mutable shares.

Mutable files are typically used only for directories, and are usually much
smaller than immutable files, so if you use one of these configurations to stop
the influx of immutable files while there is still sufficient disk space to
receive an influx of (much smaller) mutable files, you may be able to avoid the
potential for "rollback" failure.
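
Whichever option you choose, it also helps to watch free space on the storage
servers directly. A minimal sketch (the storage path shown is an assumption;
substitute wherever your node actually keeps its shares):

{{{
# Report free space on the filesystem holding the node's shares.
df -h ~/.tahoe/storage
}}}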

A future version of Tahoe will include a fix for this issue. Here is
[http://allmydata.org/pipermail/tahoe-dev/2008-May/000630.html the mailing list
discussion] about how that future version will work.


== issues in Tahoe v1.1.0 and v1.0.0 ==

=== issue 2: pyOpenSSL and/or Twisted defect resulting in false alarms in the unit tests ===

The combination of Twisted v8.1.0 and pyOpenSSL v0.7 causes the Tahoe v1.1 unit
tests to fail, even though the Tahoe behavior being tested is itself correct.

==== how to manage it ====

If you are using Twisted v8.1.0 and pyOpenSSL v0.7, then please ignore XYZ in
XYZ. Downgrading to an older version of Twisted or pyOpenSSL will cause those
false alarms to stop happening.


== issues in Tahoe v1.0.0, released 2008-03-25 ==

(Tahoe v1.0 was superseded by v1.1, which was released 2008-06-10.)

=== issue 3: server out of space when writing mutable file ===

In addition to the problems caused by insufficient disk space described above,
v1.0 clients which are writing mutable files when the servers fail to write to
their filesystem are likely to think the write succeeded when it in fact
failed. This can cause data loss.

==== how to manage it ====

Upgrade the client to v1.1, or make sure that servers are always able to write
to their local filesystem (including that there is space available) as described
in "issue 1" above.


=== issue 4: server out of space when writing immutable file ===

Tahoe v1.0 clients which are using v1.0 servers that are unable to write to
their filesystem during an immutable upload will correctly detect the first
failure, but if they retry the upload without restarting the client, or if
another client attempts to upload the same file, the second upload may appear
to succeed when it hasn't, which can lead to data loss.

==== how to manage it ====

Upgrading either or both of the client and the server to v1.1 will fix this
issue. It can also be avoided by ensuring that the servers are always able to
write to their local filesystem (including that there is space available) as
described in "issue 1" above.


=== issue 5: large directories or mutable files in a specific range of sizes ===

If a client attempts to upload a large mutable file with a size greater than
about 3,139,000 bytes and less than or equal to 3,500,000 bytes, then the upload
will fail but appear to succeed, which can lead to data loss.

(Mutable files larger than 3,500,000 bytes are refused outright.) The symptom of
the failure is very high memory usage (3 GB of memory) and 100% CPU for about 5
minutes, before it appears to succeed, although it hasn't.

Directories are stored in mutable files, and a directory of approximately 9000
entries may fall into this range of mutable file sizes (depending on the size of
the filenames or other metadata associated with the entries).
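
If you are stuck on v1.0, here is a rough, hedged sketch of checking whether a
file you are about to write as a mutable file lands in the dangerous range.
This is not a Tahoe feature, just an illustration using the bounds given above:

{{{
# Hypothetical check, not part of Tahoe: warn if a file's size falls into the
# range this issue describes (roughly 3,139,000 < size <= 3,500,000 bytes).
SIZE=$(stat -c %s myfile)   # GNU stat; use "stat -f %z" on BSD/macOS
if [ "$SIZE" -gt 3139000 ] && [ "$SIZE" -le 3500000 ]; then
  echo "WARNING: $SIZE bytes is in the range that v1.0 mishandles"
fi
}}}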

==== how to manage it ====

This was fixed in v1.1, under ticket #379. If the client is upgraded to v1.1,
then it will fail cleanly instead of falsely appearing to succeed when it tries
to write a file whose size is in this range. If the server is also upgraded to
v1.1, then writes of mutable files whose size is in this range will succeed.
(If the server is upgraded to v1.1 but the client is still v1.0 then the client
will still suffer this failure.)


=== issue 6: pycryptopp defect resulting in data corruption ===

Versions of pycryptopp earlier than pycryptopp-0.5.0 had a defect which, when
compiled with some compilers, would cause AES-256 encryption and decryption to
be computed incorrectly. This could cause data corruption. Tahoe v1.0
required, and came with a bundled copy of, pycryptopp v0.3.

==== how to manage it ====

You can detect whether pycryptopp-0.3, as compiled by your compiler, has this
failure by running the unit tests that come with pycryptopp-0.3: unpack the
"pycryptopp-0.3.tar" file that comes in the Tahoe v1.0 {{{misc/dependencies}}}
directory, cd into the resulting {{{pycryptopp-0.3.0}}} directory, and execute
{{{python ./setup.py test}}}. If the tests pass, then your compiler does not
trigger this failure.
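
Concretely, the steps above look something like this (it is assumed you are
inside the unpacked Tahoe v1.0 source tree; the tarball and directory names are
as given above):

{{{
# Run pycryptopp-0.3's own unit tests against your compiler's build.
cd misc/dependencies
tar xf pycryptopp-0.3.tar
cd pycryptopp-0.3.0
python ./setup.py test    # if these pass, your compiler is not affected
}}}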

Tahoe v1.1 requires, and comes with a bundled copy of, pycryptopp v0.5.1, which
does not have this defect.