#3888 closed defect (fixed)

Handling Tor and i2p in NURLs

Reported by: itamarst Owned by:
Priority: normal Milestone: HTTP Storage Protocol
Component: unknown Version: n/a
Keywords: Cc:
Launchpad Bug:

Description (last modified by itamarst)

NURLs, the new HTTP-based replacement for furls, do not currently support Tor or i2p syntactically. A NURL just has a hostname field... The hyperlink library also has trouble parsing a Tor or i2p address if it is forced into the hostname, which may be a legitimate "this is not a valid URI/URL" complaint.

One alternative is to switch to combo protocol URLs, e.g.:

pb://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1
pb+tor://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1
pb+i2p://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@sdsdfsdfds/jhjbc3bjbhk#v=1

Changing the NURL to this could be backwards compatible if pb:// is a synonym for normal HTTPS, as above.
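A minimal sketch of how a client might read such a combined scheme, assuming the transport is whatever follows the "+" and a bare pb means plain HTTPS. The helper name split_scheme and the transport labels are hypothetical, not part of Tahoe-LAFS.

```python
from urllib.parse import urlsplit


def split_scheme(nurl: str) -> tuple[str, str]:
    """Return (base, transport) for a NURL. Hypothetical helper."""
    scheme = urlsplit(nurl).scheme
    base, _, transport = scheme.partition("+")
    if base != "pb":
        raise ValueError(f"not a NURL scheme: {scheme!r}")
    # Assumption taken from the description: bare pb:// means normal HTTPS.
    return base, transport or "https"


for example in (
    "pb://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1",
    "pb+tor://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1",
    "pb+i2p://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@sdsdfsdfds/jhjbc3bjbhk#v=1",
):
    print(split_scheme(example))
```

Because existing pb:// strings keep their meaning under this reading, old NURLs would not need to change.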

Change History (12)

comment:1 Changed at 2022-04-06T15:40:54Z by itamarst

  • Description modified (diff)

comment:2 Changed at 2022-04-06T16:49:29Z by itamarst

Worth breaking out specific cases:

  • You can have a server listening on the public Internet while the client uses Tor. In this case a normal NURL works, insofar as you just make sure to use the Tor proxy. But in practice something that understands Tahoe semantics would provide better anonymity, so the use case is "connect to this normal NURL, but use Tor".
  • You can have a server listening on a Tor private address, which ends with ".onion" or something. In this case maybe a normal NURL works too, since you could identify that it's a special Tor address from the TLD.
  • As in the second Tor use case, I2P addresses might also be identifiable via ".i2p".

comment:3 Changed at 2022-04-06T17:37:27Z by meejah

I'm not sure that there's a problem with Tor here; onion services are on syntactically-valid .onion URLs with a well-known TLD of ".onion" (RFC 7686) which are "just" host-names.

Of course, you do need something that understands what to do with those (which in practice means "using SOCKS5 to the local Tor daemon"). For our case here, txtorcon understands all this and provides "IStreamClientEndpoint" or "IAgent" type interfaces.

comment:4 Changed at 2022-04-06T17:43:34Z by exarkun

The high-level question to answer is:

Given a string containing a NURL, how does the GBS client know how to set up the connection?

The answer implied by the ticket description is to look at the scheme and for "pb" use HTTP over TLS, for "pb+tor" use HTTP over Tor, and for "pb+i2p" use HTTP over I2P.

> I'm not sure that there's a problem with Tor here; onion services are on syntactically-valid .onion URLs with a well-known TLD of ".onion" (RFC 7686) which are "just" host-names.

It sounds like this means there might be an alternate answer possible:

Look at the scheme. If it is "pb" and the hostname does not end with ".onion" then use HTTP over TLS. If it is "pb" and the hostname does end with ".onion" then use HTTP over Tor. If it is "pb+i2p" then use HTTP over I2P.

Did I understand the Tor interaction correctly?

If so, I still might prefer the "pb+tor" scheme because of the symmetry it maintains with the other two transports to be supported.
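The alternate rule above can be sketched as follows. This is only an illustration of that rule, not Tahoe-LAFS code: the function name choose_transport and the returned labels are assumptions, and other schemes are simply rejected.

```python
from urllib.parse import urlsplit


def choose_transport(nurl: str) -> str:
    """Pick a transport label from scheme and hostname (hypothetical helper)."""
    parts = urlsplit(nurl)
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    if parts.scheme == "pb+i2p":
        return "http-over-i2p"
    if parts.scheme == "pb":
        # RFC 7686 reserves .onion, so the TLD alone identifies an onion service.
        return "http-over-tor" if host.endswith(".onion") else "http-over-tls"
    raise ValueError(f"unsupported NURL scheme: {parts.scheme!r}")


print(choose_transport(
    "pb://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1"
))
```

Under the explicit "pb+tor" alternative, the hostname check would be replaced by one more scheme comparison.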

comment:5 Changed at 2022-04-06T17:49:51Z by meejah

Yeah, "pb+tor" might be better because it also covers the case where you want to use Tor to a clearnet listener. And it's more explicit.

That said, though, "use tor for client-operations" is more of a client decision and doesn't necessarily need to be burned into the NURLs like that -- the only time you _need_ to use Tor as a client is when the listener is on a .onion address.

There's a _lot_ of other stuff to consider regarding "internet location privacy" so having e.g. a Tahoe option like "--location-privacy" would be more how I'd handle the "tor to clearnet services" situation, because there are other things you'd probably want to put over Tor and e.g. make sure you're not doing local DNS lookups or similar.

comment:6 Changed at 2022-04-06T18:01:02Z by meejah

BTW, the tor example should look like pb+tor://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@fjblvrw2jrxnhtg67qpbzi45r7ofojaoo3orzykesly2j3c2m3htapid.onion:34399/jhjbc3bjbhk#v=1

comment:7 Changed at 2022-04-06T18:18:02Z by exarkun

Summarizing some further discussion from IRC ...

There is existing support for having a lot of control over how connections are made. This is in the [connections] section of tahoe.cfg and there's docs for it. Essentially, the GBS client should be prepared to use Tor even for a "pb://..." NURL because maybe the client is configured to prefer privacy-enabling connections even when they are not required to reach a server.

There is other configuration required to be a Tor client but it does not belong in NURLs. It covers, eg, the address of a Tor-ifying SOCKS server to which to connect. This configuration is already supported in the [tor] section of tahoe.cfg and this is how the GBS client can figure out how to create a Tor client endpoint (or IAgent) to use.
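As a rough illustration of reading that configuration, here is a sketch using Python's standard configparser. The option names (tcp under [connections], enabled and socks.port under [tor]) and the example values are assumed from the Tahoe-LAFS configuration docs and should be checked there; tor_settings is a hypothetical helper, not the real loader.

```python
from configparser import ConfigParser

# Illustrative tahoe.cfg fragment; option names and values are assumptions.
EXAMPLE_CFG = """
[connections]
tcp = tor

[tor]
enabled = true
socks.port = tcp:127.0.0.1:9050
"""


def tor_settings(cfg_text: str) -> dict:
    """Extract the Tor-related choices a GBS client would need (sketch)."""
    cfg = ConfigParser()
    cfg.read_string(cfg_text)
    return {
        "enabled": cfg.getboolean("tor", "enabled", fallback=True),
        "socks_port": cfg.get("tor", "socks.port", fallback=None),
        # True when plain TCP hints should also be routed through Tor.
        "tor_for_tcp": cfg.get("connections", "tcp", fallback="tcp") == "tor",
    }


print(tor_settings(EXAMPLE_CFG))
```

None of these values appear in the NURL itself, which is the point made above.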

Foolscap's I2P support is somewhat more featureful than the pb+i2p://... example given here. It accepts mostly-arbitrary key/value pairs in the connection hints. There's room in the pb+i2p scheme though - for example, in the query arguments. eg pb+i2p://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@sdsdfsdfds/jhjbc3bjbhk#v=1&foo=bar
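A small sketch of pulling such key/value pairs out of the example NURL. Note that the example places them after the "#", so this reads the URL fragment rather than the query string. The helper name fragment_options is hypothetical, and "foo=bar" is just the placeholder from the example.

```python
from urllib.parse import parse_qs, urlsplit


def fragment_options(nurl: str) -> dict:
    """Parse key=value pairs from the NURL fragment (hypothetical helper)."""
    fragment = urlsplit(nurl).fragment
    # parse_qs returns a list per key; keep the last value for each.
    return {key: values[-1] for key, values in parse_qs(fragment).items()}


print(fragment_options(
    "pb+i2p://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@sdsdfsdfds/jhjbc3bjbhk#v=1&foo=bar"
))
```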

comment:8 Changed at 2022-06-10T19:33:31Z by meejah

I think this all leads me to conclude that we _shouldn't_ have pb+tor: scheme. There should just be a pb:// scheme and the other configuration tells a client whether to use Tor or not.

If the NURL ends in the well-defined TLD .onion then the client has to decide whether to use that NURL at all or not. If it has Tor configuration, then it can attempt to use it. If it does not, it should not attempt to use it.
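That decision could look roughly like the sketch below. The function name should_attempt and the tor_configured flag are assumptions standing in for whatever the client derives from its own configuration.

```python
from urllib.parse import urlsplit


def should_attempt(nurl: str, tor_configured: bool) -> bool:
    """Decide whether a client should try this NURL at all (sketch)."""
    host = urlsplit(nurl).hostname or ""
    if host.endswith(".onion"):
        # Onion services are only reachable through Tor.
        return tor_configured
    # Clearnet NURLs are always usable; whether to route them over Tor
    # is a separate client policy decision.
    return True


nurls = [
    "pb://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1",
    "pb://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@fjblvrw2jrxnhtg67qpbzi45r7ofojaoo3orzykesly2j3c2m3htapid.onion:34399/jhjbc3bjbhk#v=1",
]
print([n for n in nurls if should_attempt(n, tor_configured=False)])
```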

comment:9 Changed at 2022-07-25T14:19:17Z by itamarst

More notes: the way we've decided to implement listening (#3902) is by sharing the Foolscap TCP port, and therefore the Foolscap listening and configuration-parsing logic, specifically tub.port and tub.location.

For the moment, then, the actual logic for _listening_ on these ports is implemented by Foolscap, so there is no need for HTTP-specific code. However, we do need custom NURL generation for Tor (the .onion domain) and I2P. This can be done post-#3902, and it sounds like we decided on pb:// for Tor and pb+i2p:// for I2P. Even though the Tor endpoint has the same URL structure as a normal endpoint, it still needs custom code to generate it.
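A sketch of what such generation might amount to, assuming the NURL layout seen in the examples in this ticket (scheme, a certificate hash before the "@", host and optional port, a path component, and "#v=1"). The function make_nurl and its parameter names are hypothetical.

```python
from typing import Optional


def make_nurl(cert_hash: str, host: str, port: Optional[int],
              swissnum: str, i2p: bool = False) -> str:
    """Build a NURL string in the shape used in this ticket (sketch)."""
    # Per the decision above: pb:// for TCP and Tor, pb+i2p:// for I2P.
    scheme = "pb+i2p" if i2p else "pb"
    netloc = host if port is None else f"{host}:{port}"
    return f"{scheme}://{cert_hash}@{netloc}/{swissnum}#v=1"


print(make_nurl(
    "1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ",
    "fjblvrw2jrxnhtg67qpbzi45r7ofojaoo3orzykesly2j3c2m3htapid.onion",
    34399,
    "jhjbc3bjbhk",
))
```

The custom part for Tor is obtaining the .onion hostname to pass in, not the string formatting itself.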

In the long term we will need to reimplement the listening logic, but that might take a while due to need for backwards compatibility.

On the _client_ side we need two things:

  1. At minimum, HTTP client code that understands .onion URLs (#3910).
  2. An HTTP-specific Tor policy for additional privacy, specifically to address timing analysis attacks, which would work both with .onion and public servers. For example, the Tor Browser has a bunch of policies about how it routes to different domains and to a specific domain to reduce this risk. This is probably a long-term thing (#3911).

comment:10 Changed at 2022-09-15T14:08:47Z by itamarst

So actually the spec for NURLs has a proposal; regardless of what gets chosen (spec or above), we should make sure everything is in sync!

https://github.com/tahoe-lafs/tahoe-lafs/blob/master/docs/specifications/url.rst

comment:11 Changed at 2023-05-12T13:17:49Z by itamarst

Re-reading:

  • Tor support in NURL scheme is not required. The client needs to detect .onion, yes, but that doesn't require a change in the URL scheme.
  • I2P does have extra options, but those are options the _client_ will want to decide; they are not related to the NURL the server puts out. So as long as no one makes a real conflicting .i2p TLD, you can do the same thing as for Tor.

So the real issue is maybe "are we doing the right thing in the HTTP client when given .onion and .i2p domains". And custom URLs are a nice hint but maybe not actually required.

Last edited at 2023-05-12T13:18:10Z by itamarst (previous) (diff)

comment:12 Changed at 2023-07-05T14:18:30Z by itamarst

  • Resolution set to fixed
  • Status changed from new to closed

Going to close this since much of the work is either done (Tor) or split off into I2P-specific tickets (#4037).
