Changes between Initial Version and Version 1 of KnownIssues

Timestamp: 2008-06-05T20:02:29Z
Author: warner
Comment: --

KnownIssues

                       v1
+= Known Issues =
+This page describes known problems in recent releases of Tahoe. Issues are
+fixed as quickly as possible, but users of older releases may still need to
+be aware of these problems until they upgrade to a release which resolves
+them.
+== Issues in [milestone:1.1.0 Tahoe 1.1] (not quite released) ==
+=== Servers which run out of space ===
+If a Tahoe storage server runs out of space, writes will fail with an
+{{{IOError}}} exception. In some situations, Tahoe-1.1 clients will not react
+to this very well:
+ * if the exception occurs during an immutable-share write, that share will
+   be broken. The client will detect this, and will declare the upload a
+   failure if too few shares can be placed (this "shares of happiness"
+   threshold defaults to 7 out of 10). The code does not yet search for new
+   servers to replace the full ones. If the upload fails, the server's
+   upload-already-in-progress routines may interfere with a subsequent
+   upload.
+ * if the exception occurs during a mutable-share write, the old share will
+   be left in place (and a new home for the share will be sought). If enough
+   old shares are left around, subsequent reads may see the file in its
+   earlier state, known as a "rollback" fault. Writing a new version of the
+   file should find the newer shares correctly, although it will take
+   longer (more roundtrips) than usual.
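The "shares of happiness" rule for immutable uploads can be illustrated with a minimal sketch. The function name and the counts below are hypothetical and are not taken from the Tahoe source; the sketch only assumes that an upload counts as successful when at least the threshold number of shares (7 of 10 by default) was placed intact.

```python
def upload_succeeded(shares_placed: int, happiness: int = 7) -> bool:
    """Return True if enough intact shares were placed.

    Hypothetical helper: `happiness` mirrors the default
    "shares of happiness" threshold of 7 (out of 10 shares).
    """
    return shares_placed >= happiness


# Example: 10 shares attempted, some broken by out-of-space IOErrors.
total_shares = 10
for broken in (3, 4):
    placed = total_shares - broken
    status = "ok" if upload_succeeded(placed) else "failed"
    print(f"{placed} of {total_shares} shares placed: upload {status}")
```

Because the 1.1 code does not yet look for replacement servers, every share broken by a full server counts directly against this threshold.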
+The out-of-space handling code is not yet complete, and we do not yet have a
+space-limiting solution that is suitable for large storage nodes. The
+"sizelimit" configuration uses a /usr/bin/du-style query at node startup,
+which takes a long time (tens of minutes) on storage nodes that offer 100GB
+or more, making it unsuitable for highly-available servers.
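To see why a du-style query scales poorly, consider this rough sketch of such a scan. It is not Tahoe's implementation: the function name is hypothetical, and it sums file sizes rather than allocated disk blocks as `du` does. The point is only that every file under the storage directory must be visited, so startup time grows with the number of stored shares.

```python
import os


def du_style_usage(root: str) -> int:
    """Sum the sizes of all files under `root` (hypothetical helper).

    Every directory is walked and every file is stat'ed, which is
    what makes this kind of query slow on large storage nodes.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                pass
    return total
```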
+In lieu of 'sizelimit', server admins are advised to set
+NODEDIR/readonly_storage on their storage nodes (and to remove 'sizelimit'
+and restart the nodes) before space is exhausted. This will stop the influx of
+immutable shares. Mutable shares will continue to arrive, but since these are
+mainly used by directories, the amount of space consumed will be smaller.
+Eventually we will have a better solution for this.
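The switch to read-only storage might be scripted along the following lines. This is a sketch under two assumptions not stated on this page: that `readonly_storage` is a marker file whose presence in the node directory is what matters, and that `sizelimit` is likewise a file in that directory. The function name is hypothetical, and the node must still be restarted separately.

```python
import os


def switch_to_readonly(nodedir: str) -> None:
    """Remove 'sizelimit' and create 'readonly_storage' in `nodedir`.

    Assumes both settings are plain files in the node directory
    (an assumption, not confirmed here). Does not restart the node.
    """
    sizelimit_path = os.path.join(nodedir, "sizelimit")
    if os.path.exists(sizelimit_path):
        os.remove(sizelimit_path)
    # Create an empty marker file; its contents are assumed not to matter.
    with open(os.path.join(nodedir, "readonly_storage"), "w"):
        pass
```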
+== Issues in Tahoe 1.0 ==
+=== Servers which run out of space ===
+In addition to the problems described above, Tahoe-1.0 clients which
+experience out-of-space errors while writing mutable files are likely to
+think the write succeeded, when it in fact failed. This can cause data loss.
+=== Large directories or mutable files in a specific range of sizes ===
+A mismatched pair of size limits causes a problem when a client attempts to
+upload a large mutable file with a size between 3139275 and 3500000 bytes.
+(Mutable files larger than 3.5MB are refused outright.) The symptom is very
+high memory usage (3GB) and 100% CPU for about 5 minutes. The attempted write
+will fail, but the client may think that it succeeded. This size corresponds
+to roughly 9000 entries in a directory.
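The size ranges described above can be summarized in a short sketch. The constant and function names are hypothetical, and treating both ends of the problem range as inclusive is an assumption, since the text only says "between". The last line works out the per-entry size implied by the "roughly 9000 entries" figure.

```python
# Sizes in bytes, taken from the description above.
PROBLEM_RANGE_START = 3139275
MUTABLE_SIZE_LIMIT = 3500000


def classify_mutable_size(size_bytes: int) -> str:
    """Classify a mutable file size under Tahoe 1.0 (hypothetical helper)."""
    if size_bytes > MUTABLE_SIZE_LIMIT:
        return "refused"   # rejected outright
    if size_bytes >= PROBLEM_RANGE_START:
        return "affected"  # high memory/CPU use, possible false success
    return "ok"


# About 349 bytes per directory entry if 9000 entries reach the range.
bytes_per_entry = PROBLEM_RANGE_START / 9000
```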
+This was fixed in 1.1 (ticket #379). Files up to 3.5MB should now work
+properly, and files above that size should be rejected properly. Both servers
+and clients must be upgraded to resolve the problem, although once the client
+is upgraded to 1.1 the memory usage and false-success problems should be
+fixed.
+=== pycryptopp compile errors resulting in corruption ===
+Certain combinations of compiler, linker, and pycryptopp versions may cause
+errors during decryption, resulting in corrupted plaintext.