source: trunk/docs/anonymity-configuration.rst

Last change on this file was 20dd4d5, checked in by Jean-Paul Calderone <exarkun@…>, at 2020-12-09T15:50:03Z

Use "tahoe run"

.. -*- coding: utf-8-with-signature; fill-column: 77 -*-

======================================================
Using Tahoe-LAFS with an anonymizing network: Tor, I2P
======================================================

#. `Overview`_
#. `Use cases`_

#. `Software Dependencies`_

   #. `Tor`_
   #. `I2P`_

#. `Connection configuration`_

#. `Anonymity configuration`_

   #. `Client anonymity`_
   #. `Server anonymity, manual configuration`_
   #. `Server anonymity, automatic configuration`_

#. `Performance and security issues`_



Overview
========

Tor is an anonymizing network used to help hide the identity of internet
clients and servers. Please see the Tor Project's website for more information:
https://www.torproject.org/

I2P is a decentralized anonymizing network that focuses on end-to-end anonymity
between clients and servers. Please see the I2P website for more information:
https://geti2p.net/



Use cases
=========

There are three potential use-cases for Tahoe-LAFS on the client side:

1. User wishes to always use an anonymizing network (Tor, I2P) to protect
   their anonymity when connecting to Tahoe-LAFS storage grids (whether or
   not the storage servers are anonymous).

2. User does not care to protect their anonymity but they wish to connect to
   Tahoe-LAFS storage servers which are accessible only via Tor Hidden
   Services or I2P.

   * Tor is only used if a server connection hint uses ``tor:``. These hints
     generally have a ``.onion`` address.
   * I2P is only used if a server connection hint uses ``i2p:``. These hints
     generally have a ``.i2p`` address.

3. User does not care to protect their anonymity or to connect to anonymous
   storage servers. This document is not useful to you... so stop reading.
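
The hint-type rule from client use-case 2 can be sketched in a few lines of
Python. This is only an illustration of the rule as stated above, not
Tahoe-LAFS's own hint-handling code; the function name ``network_for_hint``
and the host ``example.org`` are made up for the example.

```python
from typing import Optional


def network_for_hint(hint: str) -> Optional[str]:
    """Return "tor" or "i2p" if the hint names an anonymizing network, else None."""
    # The hint type is the text before the first colon, e.g. "tor" in
    # "tor:u33m4y7klhz3b.onion:3000".
    hint_type = hint.split(":", 1)[0].strip().lower()
    if hint_type in ("tor", "i2p"):
        return hint_type
    return None


# A hypothetical announcement offering one plain TCP hint and one Tor hint.
hints = "tcp:example.org:3457,tor:u33m4y7klhz3b.onion:3000"
for hint in hints.split(","):
    print(hint, "->", network_for_hint(hint))
```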


For Tahoe-LAFS storage servers there are three use-cases:

1. The operator wishes to protect their anonymity by making their Tahoe
   server accessible only over I2P, via Tor Hidden Services, or both.

2. The operator does not *require* anonymity for the storage server, but they
   want it to be available over both publicly routed TCP/IP and through an
   anonymizing network (I2P, Tor Hidden Services). One possible reason to do
   this is that being reachable through an anonymizing network is a
   convenient way to bypass a NAT or firewall that prevents publicly routed
   TCP/IP connections to your server (for clients capable of connecting to
   such servers). Another is that making your storage server reachable
   through an anonymizing network can provide better protection for your
   clients who themselves use that anonymizing network to protect their
   anonymity.

3. Storage server operator does not care to protect their own anonymity nor
   to help the clients protect theirs. Stop reading this document and run
   your Tahoe-LAFS storage server using publicly routed TCP/IP.


   See this Tor Project page for more information about Tor Hidden Services:
   https://www.torproject.org/docs/hidden-services.html.en

   See this I2P Project page for more information about I2P:
   https://geti2p.net/en/about/intro


Software Dependencies
=====================

Tor
---

Clients who wish to connect to Tor-based servers must install the following.

* Tor (tor) must be installed. See here:
  https://www.torproject.org/docs/installguide.html.en . On Debian/Ubuntu,
  use ``apt-get install tor``. You can also install and run the Tor Browser
  Bundle.

* Tahoe-LAFS must be installed with the ``[tor]`` "extra" enabled. This will
  install ``txtorcon`` ::

   pip install tahoe-lafs[tor]

Manually-configured Tor-based servers must install Tor, but do not need
``txtorcon`` or the ``[tor]`` extra. Automatic configuration (see `Server
anonymity, automatic configuration`_) needs these, just like clients.

I2P
---

Clients who wish to connect to I2P-based servers must install the following.
As with Tor, manually-configured I2P-based servers need the I2P daemon, but
no special Tahoe-side supporting libraries.

* I2P must be installed. See here:
  https://geti2p.net/en/download

* The SAM API must be enabled.

  * Start I2P.
  * Visit http://127.0.0.1:7657/configclients in your browser.
  * Under "Client Configuration", check the "Run at Startup?" box for "SAM
    application bridge".
  * Click "Save Client Configuration".
  * Click the "Start" control for "SAM application bridge", or restart I2P.

* Tahoe-LAFS must be installed with the ``[i2p]`` extra enabled, to get
  ``txi2p`` ::

   pip install tahoe-lafs[i2p]

Both Tor and I2P
----------------

Clients who wish to connect to both Tor- and I2P-based servers must install
all of the above. In particular, Tahoe-LAFS must be installed with both
extras enabled::

   pip install tahoe-lafs[tor,i2p]
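
One quick way to confirm that the extras were installed is to check whether
the support libraries can be imported by the Python environment that runs
Tahoe-LAFS. The sketch below assumes the import names match the package names
``txtorcon`` and ``txi2p``; the helper ``missing_support_libraries`` is
hypothetical and not part of Tahoe-LAFS.

```python
from importlib.util import find_spec


def missing_support_libraries(modules=("txtorcon", "txi2p")):
    """Return the names of the given top-level modules that cannot be imported."""
    return [name for name in modules if find_spec(name) is None]


# An empty list means both support libraries are importable here.
print(missing_support_libraries())
```

Run it with the same interpreter (or virtualenv) that Tahoe-LAFS was
installed into, otherwise the result says nothing about the node itself.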



Connection configuration
========================

See :ref:`Connection Management` for a description of the ``[tor]`` and
``[i2p]`` sections of ``tahoe.cfg``. These control how the Tahoe client will
connect to a Tor/I2P daemon, and thus make connections to Tor/I2P-based
servers.

The ``[tor]`` and ``[i2p]`` sections only need to be modified to use unusual
configurations, or to enable automatic server setup.

The default configuration will attempt to contact a local Tor/I2P daemon
listening on the usual ports (9050/9150 for Tor, 7656 for I2P). As long as
there is a daemon running on the local host, and the necessary support
libraries were installed, clients will be able to use Tor-based servers
without any special configuration.
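
To see whether a local daemon is listening on the usual ports, a small probe
such as the following can help. It is a rough sketch using only the ports
named above; it merely tests that something accepts TCP connections, and does
not verify that the listener is really Tor or the I2P SAM bridge. The names
``DEFAULT_PORTS``, ``port_is_open`` and ``local_daemons`` are made up for the
example.

```python
import socket

# Usual local ports, as listed in the text above.
DEFAULT_PORTS = {"tor": (9050, 9150), "i2p": (7656,)}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def local_daemons(host="127.0.0.1"):
    """Map each network name to the usual ports that accept connections."""
    return {
        name: [port for port in ports if port_is_open(host, port)]
        for name, ports in DEFAULT_PORTS.items()
    }


print(local_daemons())
```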

However, note that this default configuration does not improve the client's
anonymity: normal TCP connections will still be made to any server that
offers a regular address (it fulfills the second client use case above, not
the first). To protect their anonymity, users must configure the
``[connections]`` section as follows::

  [connections]
  tcp = tor

With this in place, the client will use Tor (instead of an
IP-address-revealing direct connection) to reach TCP-based servers.

Anonymity configuration
=======================

Tahoe-LAFS provides a configuration "safety flag" for explicitly stating
whether or not IP-address privacy is required for a node::

   [node]
   reveal-IP-address = (boolean, optional)

When ``reveal-IP-address = False``, Tahoe-LAFS will refuse to start if any of
the configuration options in ``tahoe.cfg`` would reveal the node's network
location:

* ``[connections] tcp = tor`` is required: otherwise the client would make
  direct connections to the Introducer, or any TCP-based servers it learns
  from the Introducer, revealing its IP address to those servers and a
  network eavesdropper. With this in place, Tahoe-LAFS will only make
  outgoing connections through a supported anonymizing network.

* ``tub.location`` must either be disabled, or contain safe values. This
  value is advertised to other nodes via the Introducer: it is how a server
  advertises its location so clients can connect to it. In private mode, it
  is an error to include a ``tcp:`` hint in ``tub.location``. Private mode
  also rejects the default value of ``tub.location`` (used when the key is
  missing entirely), which is ``AUTO``: this uses ``ifconfig`` to guess the
  node's external IP address, and would reveal it to the server and other
  clients.

This option is **critical** to preserving the client's anonymity (client
use-case 1 from `Use cases`_, above). It is also necessary to preserve a
server's anonymity (server use-case 1).

This flag can be set (to False) by providing the ``--hide-ip`` argument to
the ``create-node``, ``create-client``, or ``create-introducer`` commands.

Note that the default value of ``reveal-IP-address`` is True, because
unfortunately hiding the node's IP address requires additional software to be
installed (as described above), and reduces performance.
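
The two startup checks described above can be approximated with Python's
standard ``configparser``. This is a sketch of the rules as this document
states them, not the validation code Tahoe-LAFS actually runs; the function
name ``privacy_violations`` and the message strings are invented for the
example.

```python
import configparser


def privacy_violations(cfg_text):
    """List reasons a config would be rejected when reveal-IP-address is false."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # keep option names such as reveal-IP-address as written
    cfg.read_string(cfg_text)

    # The flag defaults to True, in which case nothing is checked.
    if cfg.getboolean("node", "reveal-IP-address", fallback=True):
        return []

    problems = []
    # Outgoing TCP connections must be routed through Tor.
    if cfg.get("connections", "tcp", fallback="tcp").strip() != "tor":
        problems.append("[connections] tcp must be 'tor'")

    # A missing tub.location means AUTO, which would reveal the IP address.
    location = cfg.get("node", "tub.location", fallback="AUTO").strip()
    if location != "disabled":
        for hint in location.split(","):
            hint = hint.strip()
            if hint == "AUTO" or hint.startswith("tcp:"):
                problems.append("unsafe tub.location hint: " + hint)
    return problems


example = """\
[node]
reveal-IP-address = false
tub.port = disabled
tub.location = disabled
[connections]
tcp = tor
"""
print(privacy_violations(example))
```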

Client anonymity
----------------

To configure a client node for anonymity, ``tahoe.cfg`` **must** contain the
following configuration flags::

   [node]
   reveal-IP-address = False
   tub.port = disabled
   tub.location = disabled

As described above, ``reveal-IP-address = False`` also requires
``[connections] tcp = tor``. Once the Tahoe-LAFS node has been restarted, it
can be used anonymously (client use-case 1).
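
For scripted setups, the client settings can be applied to an existing
``tahoe.cfg`` with ``configparser``. This is a hedged sketch rather than a
supported tool: the function name ``make_client_private`` is made up, and
note that ``configparser`` discards comments when it rewrites a file, so
editing by hand may be preferable for a commented config.

```python
import configparser
import io


def make_client_private(cfg_text):
    """Return cfg_text with the client anonymity settings applied."""
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # preserve the case of option names
    cfg.read_string(cfg_text)
    for section in ("node", "connections"):
        if not cfg.has_section(section):
            cfg.add_section(section)

    # Settings from the "Client anonymity" section above.
    cfg["node"]["reveal-IP-address"] = "False"
    cfg["node"]["tub.port"] = "disabled"
    cfg["node"]["tub.location"] = "disabled"
    # Required whenever reveal-IP-address is False.
    cfg["connections"]["tcp"] = "tor"

    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


print(make_client_private("[node]\nnickname = example-client\n"))
```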

Server anonymity, manual configuration
--------------------------------------

To configure a server node to listen on an anonymizing network, we must first
configure Tor to run an "Onion Service", and route inbound connections to the
local Tahoe port. Then we configure Tahoe to advertise the ``.onion`` address
to clients. We also configure Tahoe to not make direct TCP connections.

* Decide on a local listening port number, named PORT. This can be any unused
  port from about 1024 up to 65535 (depending upon the host's kernel/network
  config). We will tell Tahoe to listen on this port, and we'll tell Tor to
  route inbound connections to it.
* Decide on an external port number, named VIRTPORT. This will be used in the
  advertised location, and revealed to clients. It can be any number from 1
  to 65535. It can be the same as PORT, if you like.
* Decide on a "hidden service directory", usually in ``/var/lib/tor/NAME``.
  We'll be asking Tor to save the onion-service state here, and Tor will
  write the ``.onion`` address here after it is generated.

Then, do the following:

* Create the Tahoe server node (with ``tahoe create-node``), but do **not**
  launch it yet.

* Edit the Tor config file (typically in ``/etc/tor/torrc``). We need to add
  a section to define the hidden service. If our PORT is 2000, VIRTPORT is
  3000, and we're using ``/var/lib/tor/tahoe`` as the hidden service
  directory, the section should look like::

    HiddenServiceDir /var/lib/tor/tahoe
    HiddenServicePort 3000 127.0.0.1:2000

* Restart Tor, with ``systemctl restart tor``. Wait a few seconds.

* Read the ``hostname`` file in the hidden service directory (e.g.
  ``/var/lib/tor/tahoe/hostname``). This will be a ``.onion`` address, like
  ``u33m4y7klhz3b.onion``. Call the part before ``.onion`` ONION.

* Edit ``tahoe.cfg`` to set ``tub.port`` to use
  ``tcp:PORT:interface=127.0.0.1``, and ``tub.location`` to use
  ``tor:ONION.onion:VIRTPORT``. Using the examples above, this would be::

    [node]
    reveal-IP-address = false
    tub.port = tcp:2000:interface=127.0.0.1
    tub.location = tor:u33m4y7klhz3b.onion:3000
    [connections]
    tcp = tor

* Launch the Tahoe server with ``tahoe run $NODEDIR``

The ``tub.port`` setting will cause the Tahoe server to listen on PORT, but
bind the listening socket to the loopback interface, which is not reachable
from the outside world (but *is* reachable by the local Tor daemon). Then the
``tcp = tor`` setting causes Tahoe to use Tor when connecting to the
Introducer, hiding its IP address. The node will then announce itself to all
clients using ``tub.location``, so clients will know that they must use Tor
to reach this server (and the announcement will not reveal its IP address).
When clients connect to the onion address, their packets will
flow through the anonymizing network and eventually land on the local Tor
daemon, which will then make a connection to PORT on localhost, which is
where Tahoe is listening for connections.
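
To keep PORT, VIRTPORT and the onion hostname consistent between ``torrc``
and ``tahoe.cfg``, both fragments can be generated from the same values. The
helper below is a hypothetical sketch (``onion_server_config`` is not a
Tahoe-LAFS or Tor function); it only formats the text shown in the steps
above and does not touch any files. Its ``onion_host`` argument is the full
contents of the ``hostname`` file, including the ``.onion`` suffix.

```python
def onion_server_config(port, virtport, hidden_service_dir, onion_host):
    """Return (torrc_stanza, tahoe_cfg_fragment) for a manual onion server."""
    for value in (port, virtport):
        if not 1 <= value <= 65535:
            raise ValueError("port out of range: %r" % (value,))

    # Tor routes connections arriving on VIRTPORT to the local Tahoe PORT.
    torrc = (
        f"HiddenServiceDir {hidden_service_dir}\n"
        f"HiddenServicePort {virtport} 127.0.0.1:{port}\n"
    )
    # Tahoe listens on loopback only and advertises the onion address.
    tahoe_cfg = (
        "[node]\n"
        "reveal-IP-address = false\n"
        f"tub.port = tcp:{port}:interface=127.0.0.1\n"
        f"tub.location = tor:{onion_host}:{virtport}\n"
        "[connections]\n"
        "tcp = tor\n"
    )
    return torrc, tahoe_cfg


torrc, tahoe_cfg = onion_server_config(
    2000, 3000, "/var/lib/tor/tahoe", "u33m4y7klhz3b.onion"
)
print(torrc)
print(tahoe_cfg)
```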

Follow a similar process to build a Tahoe server that listens on I2P. The
same process can be used to listen on both Tor and I2P (``tub.location =
tor:ONION.onion:VIRTPORT,i2p:ADDR.i2p``). It can also listen on both Tor and
plain TCP (use-case 2), with ``tub.port = tcp:PORT``, ``tub.location =
tcp:HOST:PORT,tor:ONION.onion:VIRTPORT``, and ``reveal-IP-address = true``
(and omit the ``tcp = tor`` setting, as the address is already being
broadcast through the location announcement).


Server anonymity, automatic configuration
-----------------------------------------

To configure a server node to listen on an anonymizing network, create the
node with the ``--listen=tor`` option. This requires a Tor configuration that
either launches a new Tor daemon, or has access to the Tor control port (and
enough authority to create a new onion service). On Debian/Ubuntu systems, do
``apt install tor``, add yourself to the control group with ``adduser
YOURUSERNAME debian-tor``, and then log out and log back in: if the
``groups`` command includes ``debian-tor`` in the output, you should have
permission to use the unix-domain control port at ``/var/run/tor/control``.

This option will set ``reveal-IP-address = False`` and ``[connections] tcp =
tor``. It will allocate the necessary ports, instruct Tor to create the onion
service (saving the private key somewhere inside NODEDIR/private/), obtain
the ``.onion`` address, and populate ``tub.port`` and ``tub.location``
correctly.
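
Before creating a node with ``--listen=tor``, it can be useful to confirm
that the current user really has access to the control socket mentioned
above. The following is a minimal sketch for Unix-like systems; the function
name ``control_socket_usable`` is hypothetical, and a ``True`` result only
means the path is a socket that this user may read and write, not that Tor
will accept the connection.

```python
import os
import stat


def control_socket_usable(path="/var/run/tor/control"):
    """Return True if path is a unix-domain socket this user may read and write."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path, or a parent directory we may not traverse.
        return False
    return stat.S_ISSOCK(mode) and os.access(path, os.R_OK | os.W_OK)


print(control_socket_usable())
```

If this prints ``False`` right after running ``adduser``, the new group
membership has probably not taken effect yet for the current login session.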


Performance and security issues
===============================

If you are running a server which does not itself need to be
anonymous, should you make it reachable via an anonymizing network or
not? Or should you make it reachable *both* via an anonymizing network
and as a publicly traceable TCP/IP server?

There are several trade-offs affected by this decision.

NAT/Firewall penetration
------------------------

Making a server reachable via Tor or I2P makes it reachable (by
Tor/I2P-capable clients) even if there are NATs or firewalls preventing
direct TCP/IP connections to the server.

Anonymity
---------

Making a Tahoe-LAFS server accessible *only* via Tor or I2P can be used to
guarantee that the Tahoe-LAFS clients use Tor or I2P to connect
(specifically, the server should only advertise Tor/I2P addresses in the
``tub.location`` config key). This prevents misconfigured clients from
accidentally de-anonymizing themselves by connecting to your server through
the traceable Internet.

Clearly, a server which is available as both a Tor/I2P service *and* a
regular TCP address is not itself anonymous: the .onion address and the real
IP address of the server are easily linkable.

Also, interaction, through Tor, with a Tor Hidden Service may be more
protected from network traffic analysis than interaction, through Tor,
with a publicly traceable TCP/IP server.

**XXX is there a document maintained by Tor developers which substantiates or refutes this belief?
If so we need to link to it. If not, then maybe we should explain more here why we think this?**

Linkability
-----------

As of 1.12.0, the node uses a single persistent Tub key for outbound
connections to the Introducer, and inbound connections to the Storage Server
(and Helper). For clients, a new Tub key is created for each storage server
we learn about, and these keys are *not* persisted (so they will change each
time the client reboots).

Clients traversing directories (from rootcap to subdirectory to filecap) are
likely to request the same storage-indices (SIs) in the same order each time.
A client connected to multiple servers will ask them all for the same SI at
about the same time. And two clients which are sharing files or directories
will visit the same SIs (at various times).

As a result, the following things are linkable, even with ``reveal-IP-address
= false``:

* Storage servers can recognize multiple connections from the same
  not-yet-rebooted client. (Note that the upcoming Accounting feature may
  cause clients to present a persistent client-side public key when
  connecting, which will be a much stronger linkage).
* Storage servers can probably deduce which client is accessing data, by
  looking at the SIs being requested. Multiple servers can collude to
  determine that the same client is talking to all of them, even though the
  TubIDs are different for each connection.
* Storage servers can deduce when two different clients are sharing data.
* The Introducer could deliver different server information to each
  subscribed client, to partition clients into distinct sets according to
  which server connections they eventually make. For client+server nodes, it
  can also correlate the server announcement with the deduced client
  identity.
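
To make the SI-based linkage concrete, here is a toy illustration of how
colluding servers might compare the storage indices requested over different
connections. It is an assumption-laden sketch, not a description of any real
attack tool: the TubID and SI labels are invented, and a real observer would
also use ordering and timing rather than set overlap alone.

```python
def si_overlap(requests_a, requests_b):
    """Jaccard similarity between two collections of requested storage indices."""
    a, b = set(requests_a), set(requests_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical per-connection logs, keyed by the TubID seen on each connection.
observed = {
    "tubid-1": ["si-root", "si-dir", "si-file"],
    "tubid-2": ["si-root", "si-dir", "si-file"],
    "tubid-3": ["si-other"],
}

names = sorted(observed)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        score = si_overlap(observed[first], observed[second])
        print(first, second, round(score, 2))
```

A high overlap between connections with different TubIDs suggests the same
client (or two clients sharing data) is behind both connections.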

Performance
-----------

A client connecting to a publicly traceable Tahoe-LAFS server through Tor
incurs substantially higher latency and sometimes worse throughput than the
same client connecting to the same server over a normal traceable TCP/IP
connection. When the server is on a Tor Hidden Service, it incurs even more
latency, and possibly even worse throughput.

Connecting to Tahoe-LAFS servers which are I2P servers incurs higher latency
and worse throughput too.

Positive and negative effects on other Tor users
------------------------------------------------

Sending your Tahoe-LAFS traffic over Tor adds cover traffic for other
Tor users who are also transmitting bulk data. So that is good for
them -- increasing their anonymity.

However, it makes the performance of other Tor users' interactive
sessions -- e.g. ssh sessions -- much worse. This is because Tor
doesn't currently have any prioritization or quality-of-service
features, so someone else's ssh keystrokes may have to wait in line
while your bulk file contents get transmitted. The added delay might
make other people's interactive sessions unusable.

Both of these effects are doubled if you upload or download files to a
Tor Hidden Service, as compared to if you upload or download files
over Tor to a publicly traceable TCP/IP server.

Positive and negative effects on other I2P users
------------------------------------------------

Sending your Tahoe-LAFS traffic over I2P adds cover traffic for other I2P users
who are also transmitting data. So that is good for them -- increasing their
anonymity. It will not directly impair the performance of other I2P users'
interactive sessions, because the I2P network has several congestion control and
quality-of-service features, such as prioritizing smaller packets.

However, if many users are sending Tahoe-LAFS traffic over I2P, and do not have
their I2P routers configured to participate in much traffic, then the I2P
network as a whole will suffer degradation. Each Tahoe-LAFS node using I2P has
its own anonymizing tunnels that its data is sent through. On average, one
Tahoe-LAFS node requires 12 other I2P routers to participate in its tunnels.

It is therefore important that your I2P router is sharing bandwidth with other
routers, so that you can give back as you use I2P. This will never impair the
performance of your Tahoe-LAFS node, because your I2P router will always
prioritize your own traffic.