.. -*- coding: utf-8-with-signature -*-

**********************
The Convergence Secret
**********************

What Is It?
-----------

The identifier of a file (also called the "capability" to a file) is derived
from two pieces of information when the file is uploaded: the content of the
file and the upload client's "convergence secret". By default, the
convergence secret is randomly generated by the client when it first starts
up, then stored in the client's base directory
(``<Tahoe's node dir>/private/convergence``) and re-used after that. So the
same file content uploaded from the same client will always have the same
cap. Uploading the file from a different client with a different convergence
secret would result in a different cap -- and in a second copy of the file's
contents stored on the grid. If you want files you upload to converge (also
known as "deduplicate") with files uploaded by someone else, make sure that
you and they use the same convergence secret when uploading.

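The dependence of the cap on both inputs can be illustrated with a toy model.
The sketch below is not the derivation Tahoe-LAFS actually performs, which is
more involved. The function name ``toy_identifier``, the use of HMAC-SHA256,
and the example secrets are all assumptions made purely for illustration.

```python
import hashlib
import hmac


def toy_identifier(convergence_secret: bytes, content: bytes) -> str:
    """Toy stand-in for a cap: a keyed hash of the file content.

    Illustrative only; NOT the real Tahoe-LAFS derivation.
    """
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


# Hypothetical secrets; a real client generates its secret randomly.
secret_a = b"secret stored by client A"
secret_b = b"secret stored by client B"
content = b"the same file content"

# Same secret, same content: identical identifiers, so uploads converge.
same_client = toy_identifier(secret_a, content) == toy_identifier(secret_a, content)
# Different secrets: different identifiers, so a second copy is stored.
other_client = toy_identifier(secret_a, content) == toy_identifier(secret_b, content)

print(same_client, other_client)  # True False
```

In this model, two clients deduplicate with each other exactly when they are
configured with the same secret bytes.
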
The advantages of deduplication should be clear, but keep in mind that the
convergence secret was created to protect confidentiality. There are two
attacks that can be used against you by someone who knows the convergence
secret you use.

The first one is called the "Confirmation-of-a-File Attack". Someone who
knows the convergence secret that you used when you uploaded a file, and who
has a copy of that file themselves, can check whether you have a copy of that
file. This is usually not a problem, but it could be if that file is, for
example, a book or movie that is banned in your country.

The second attack is more subtle. It is called the
"Learn-the-Remaining-Information Attack". Suppose you've received a
confidential document, such as a PDF from your bank that contains many pages
of boilerplate text as well as your bank account number and balance. Someone
who knows your convergence secret can generate a file with all of the
boilerplate text (perhaps they would open an account with the same bank so
they receive the same document with their account number and balance). Then
they can try a "brute force search" to find your account number and your
balance, substituting one candidate value after another until the resulting
file matches yours.

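The brute force search can be sketched with a toy model. Everything here is
an assumption for illustration: ``toy_identifier`` is a simple keyed hash and
not the real Tahoe-LAFS derivation, the document template and the four-digit
account number are invented, and the attacker is assumed to have some way of
learning or testing the identifier of the victim's file.

```python
import hashlib
import hmac
from typing import Optional


def toy_identifier(convergence_secret: bytes, content: bytes) -> str:
    # Toy stand-in for a cap; NOT the real Tahoe-LAFS derivation.
    return hmac.new(convergence_secret, content, hashlib.sha256).hexdigest()


# Hypothetical bank letter: known boilerplate plus one unknown field.
TEMPLATE = "Dear customer, ... boilerplate ...\nAccount: {account:04d}\n"


def render(account: int) -> bytes:
    return TEMPLATE.format(account=account).encode("utf-8")


def recover_account(secret: bytes, observed: str) -> Optional[int]:
    """Try every 4-digit account number until the identifier matches."""
    for candidate in range(10_000):
        if toy_identifier(secret, render(candidate)) == observed:
            return candidate
    return None


shared_secret = b"secret known to the attacker"
victim_identifier = toy_identifier(shared_secret, render(4711))

# With the right secret, the unknown field is recovered.
found = recover_account(shared_secret, victim_identifier)
# With any other secret, no candidate matches.
not_found = recover_account(b"some other secret", victim_identifier)

print(found, not_found)  # 4711 None
```

The Confirmation-of-a-File Attack is the special case with a single
candidate: the attacker already holds the whole file and only checks whether
its identifier matches.
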
The defense against these attacks is that only someone who knows the
convergence secret that you used on each file can perform these attacks on
that file.

Both of these attacks and the defense are described in more detail in `Drew
Perttula's Hack Tahoe-LAFS Hall Of Fame entry`_.

.. _`Drew Perttula's Hack Tahoe-LAFS Hall Of Fame entry`:
   https://tahoe-lafs.org/hacktahoelafs/drew_perttula.html

What If I Change My Convergence Secret?
---------------------------------------

All your old file capabilities will still work, but the new data that you
upload will not be deduplicated with the old data. If you upload all of the
same things to the grid, you will end up using twice the space until garbage
collection kicks in (if it's enabled). Changing the convergence secret that a
storage client uses for uploads can be thought of as moving the client to a
new "deduplication domain".

How To Use It
-------------

To enable deduplication between different clients, **securely** copy the
convergence secret file from one client to all the others.

For example, if you are on host A and have an account on host B and you have
scp installed, run::

  scp ~/.tahoe/private/convergence my_other_account@B:.tahoe/private/convergence

If you have two different clients on a single computer, say one for each
disk, you would do::

  cp /tahoe1/private/convergence /tahoe2/private/convergence

After you change the convergence secret file, you must restart the client
before it will stop using the old one and read the new one from the file.
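
For two clients on the same machine, the copy step could also be scripted.
The following is a minimal sketch, not part of Tahoe-LAFS: the helper name
``share_convergence_secret`` is hypothetical, and restricting the file to
owner-only permissions is an assumption about what "securely" should mean
locally. The demonstration runs on throwaway directories standing in for
``/tahoe1`` and ``/tahoe2``.

```python
import os
import shutil
import tempfile
from pathlib import Path


def share_convergence_secret(src_node_dir: Path, dst_node_dir: Path) -> bool:
    """Copy private/convergence from one node directory to another.

    Returns True if the destination now holds the same bytes as the source.
    """
    src = src_node_dir / "private" / "convergence"
    dst = dst_node_dir / "private" / "convergence"
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    os.chmod(dst, 0o600)  # keep the secret readable by its owner only
    return src.read_bytes() == dst.read_bytes()


# Demonstration with temporary directories and a made-up secret.
with tempfile.TemporaryDirectory() as tmp:
    node1 = Path(tmp) / "tahoe1"
    node2 = Path(tmp) / "tahoe2"
    (node1 / "private").mkdir(parents=True)
    (node1 / "private" / "convergence").write_bytes(b"example-secret\n")
    copied_ok = share_convergence_secret(node1, node2)

print(copied_ok)  # True
```

The destination client still has to be restarted afterwards before it picks
up the copied secret.
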