[tahoe-dev] Using Pagekite with Tahoe-LAFS
Andrew Miller
amiller at cs.ucf.edu
Sun Jul 10 19:44:27 PDT 2011
Here's an explanation of how I want to use Pagekite [1] with
Tahoe-LAFS. No modification is needed to either Tahoe or Pagekite. I
hope in this thread we can determine if it's a safe plan!
Pagekite is a tool for routing HTTP (and other) traffic through a
public proxy (a frontend), and for offering such a proxy service to
others. I'm proposing to use one or more Pagekite frontends as proxies
for Tahoe storage nodes. The benefits are:
1. It could make participating in a Tahoe grid viable for some
users (behind a NAT, no SSH access to a public server, Freedombox
folks)
2. It diversifies the risk of storage nodes becoming unavailable
because a domain name expires or a single tunnel proxy fails
3. It allows an inexpensive but untrustworthy public proxy (shared
hosting) to be paired with cheap storage that has no public IP (a
server in your apartment)
Motivating example
==============
Bob buys a Freedombox and wants to use it as a Tahoe-LAFS storage
node (for now, just to participate on the public test grid).
Unfortunately, Bob is behind a firewall/NAT that he has no chance of
reconfiguring. Bob doesn't have SSH access to a machine anywhere else
where he could open a port; otherwise he would just make a tunnel.
Alice supports the freedombox community by offering a public Pagekite
frontend service. Bob registers his tahoe node with the following
command (simplified syntax, some detail omitted *):
python pagekite.py --frontend=alicefrontend.com
--backend=http:tahoebob.alicefrontend.com:localhost:49293
and then he configures his ~/.tahoe/tahoe.cfg with:
tub.port = 49293
tub.location = tahoebob.alicefrontend.com:80
Bob has to rely on Alice to guarantee QoS (availability, bandwidth),
but otherwise the provider-independent security claims** of Tahoe are
left intact: Alice can't snoop or corrupt the files that Bob (or
anyone else) uploads to the grid, since Tahoe's network traffic is
encrypted end-to-end.
If there are many freedombox supporters like Alice, Bob can easily
run multiple pagekites (or add to a pagekite.rc) and then list all of
them in his FURL, e.g.:
tub.location = tahoebob.alicefrontend.com:80,tahoebob.carolfrontend.com:80,tahoebob.daisyfrontend.com:80
This way, Bob's storage node will be able to participate in the grid
even if some of the frontends break, discontinue service, or forget to
pay their domain name bills. Bob may also choose to pay a frontend in
exchange for the bandwidth - such paid services would have to compete
only on price and QoS rather than 'trustiness', since none of them
would have the capacity to eavesdrop anyway.
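As a rough illustration of the multi-frontend tub.location value shown above, here is a minimal Python sketch that assembles it from a list of frontend domains. The helper name build_tub_location is made up for this example (it is not part of Tahoe or Pagekite), and it assumes every frontend gives Bob the same subdomain and listens on port 80.

```python
def build_tub_location(node_name, frontend_domains, port=80):
    """Join one host:port hint per frontend into a tub.location value.

    Hypothetical helper for illustration only; it assumes the same
    subdomain (node_name) is registered at every frontend.
    """
    hints = ["%s.%s:%d" % (node_name, domain, port) for domain in frontend_domains]
    return ",".join(hints)


# Frontend domains taken from the example in this post.
frontends = ["alicefrontend.com", "carolfrontend.com", "daisyfrontend.com"]
tub_location = build_tub_location("tahoebob", frontends)

# Print the line as it would appear in ~/.tahoe/tahoe.cfg.
print("tub.location = " + tub_location)
```

Adding or dropping a frontend then only means editing the list and regenerating the line.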
How Pagekite works
========================
Pagekite is similar to SSH reverse tunneling. The main difference is
that a frontend can serve multiple backends from a single public IP
and port, instead of one-port-per-backend. Pagekite decides how to
route each incoming connection by inspecting the initial request
itself, which means it's restricted to protocols that carry along a
hostname in the first communication, for example HTTP, or
TLS-with-SNI.
In particular, this works for foolscap (Tahoe's underlying protocol)
because foolscap begins with a plaintext HTTP request and later
Upgrades to TLS. [2]
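To make the routing idea concrete, here is a small Python sketch of how a frontend might pick a backend from the first bytes of a connection. This is not Pagekite's actual code: the function names, the backend table, and the sample request are all assumptions for illustration, and the sample only resembles the kind of plaintext HTTP request described above.

```python
def extract_host(first_bytes):
    """Return the Host header value (lowercased, port stripped) from the
    start of an HTTP request, or None if no Host header is present.

    Simplification: bracketed IPv6 literals are not handled.
    """
    head = first_bytes.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")
    for line in lines[1:]:  # lines[0] is the request line, e.g. "GET / HTTP/1.1"
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("ascii", "replace").lower()
            return host.rsplit(":", 1)[0] if ":" in host else host
    return None


def pick_backend(first_bytes, backends):
    """Look up the tunnel registered for the requested hostname, if any."""
    return backends.get(extract_host(first_bytes))


# Hypothetical table of registered backends: hostname -> tunnel label.
backends = {"tahoebob.alicefrontend.com": "tunnel-to-bob"}

# Illustrative request only; the path and headers are made up.
sample = (b"GET /id/exampletubid HTTP/1.1\r\n"
          b"Host: tahoebob.alicefrontend.com:80\r\n"
          b"Upgrade: TLS/1.0\r\n\r\n")

print(pick_backend(sample, backends))
```

The frontend only needs the hostname to choose a tunnel; everything after the upgrade to TLS passes through as opaque bytes.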
When Pagekite is used to transfer end-to-end encrypted traffic, like
HTTPS, the security of the communication relies on the underlying
protocol (did you verify the certificate, to avoid man-in-the-middle
attacks?). Tahoe nodes use foolscap, for which the FURL itself
contains a hash of the SSL certificate. [3]
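For readers unfamiliar with FURLs, the following Python sketch splits one into its parts. It assumes the general shape pb://TUBID@HINT,HINT/NAME, where TUBID is the certificate hash mentioned above and each HINT is a host:port pair. The parse_furl function and the FURL value are invented for this example, and real FURLs may carry details this sketch ignores.

```python
def parse_furl(furl):
    """Split a FURL into (tub_id, connection_hints, name).

    Sketch only: assumes the form pb://TUBID@HINT,HINT/NAME.
    """
    prefix = "pb://"
    if not furl.startswith(prefix):
        raise ValueError("not a pb:// FURL")
    rest = furl[len(prefix):]
    tub_id, sep, rest = rest.partition("@")
    if not sep:
        raise ValueError("missing '@' between tub ID and hints")
    hints, sep, name = rest.partition("/")
    if not sep:
        raise ValueError("missing '/' before the name")
    return tub_id, hints.split(","), name


# Made-up FURL; the tub ID and name are placeholders, not real values.
example = ("pb://exampletubid@tahoebob.alicefrontend.com:80,"
           "tahoebob.carolfrontend.com:80/examplename")

tub_id, hints, name = parse_furl(example)
print(tub_id)  # the part a client compares against the certificate hash
print(hints)   # the host:port pairs a client can try
```

Because the tub ID travels inside the FURL, a client can check the certificate it receives no matter which frontend relayed the connection.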
For these reasons, I like to think of Pagekite as a Least Authority
Frontend Service***. It's meant to be user-friendly, both for Bob to
set up his backend, and also for Alice to set up a frontend offered to
the public, without either of them relying on each other's honesty.
Bjarni Runar Einarsson (creator of Pagekite) operates Pagekite.net,
the reference 'paid service'. I hope that in the future there are many
competing services, perhaps offering bandwidth in exchange for
Bitcoin. These services are likely to appeal to the Freedombox crowd,
who will be eager to exchange bandwidth and resources among peers,
especially if they can do so without risking their security. Foolscap
works great in this context as well, since a FURL can contain multiple
hostnames which act as failovers. [3]
My Experiment
===========
I already run a pagekite frontend on a cheap $5/month VPS that has a
generous bandwidth limit; however I prefer not to rely on its
security. [4] I'm excited any time I can make use of this VPS without
risking my data - mostly just by tunneling SSH and self-signed HTTPS.
I'm hoping to add Tahoe storage to the list. I have machines with
cheap storage and power at friends' houses, but those don't have
public IPs. So, I direct my Tahoe storage nodes to establish pagekite
connections to both my frontend and the pagekite.net service as a
fallback. If you view the public test grid (as of Jul 10, 2011 anyway)
you should see a handful of storage nodes (with a prefix of amiller_)
that all appear to have connections through the same IP and port.
However, the peer IDs are distinct and allow Pagekite to correctly
route the traffic.
http://insecure.tahoe-lafs.org/
Probably most Tahoe users are able to set up their own tunnels as
needed, and won't directly benefit from pagekite. Even in my case, I
do have SSH access to my frontend, so I could just make as many SSH
tunnels as I need. I still prefer Pagekite, since I never get the ssh
command right on the first try. Also, I can add new Tahoe storage
nodes without having to log into or modify my pagekite frontends.
Diagram
======
https://docs.google.com/drawings/d/1_HURhL2ZHlYfMMeKKNUQD8EdC7uHwS_1Csd8cZFf2M8/edit?hl=en_US
How NOT to use Pagekite with Tahoe
============================
A very bad idea is to use Pagekite to expose Tahoe's HTTP web
user-interface. The HTTP interface transmits files and caps in
plaintext, and therefore requires elevated trust. This is fine as a
loopback connection to the local machine, but it shouldn't be used
over the internet, Pagekite or not. Pagekite would be okay for this
purpose if you set up HTTPS properly for the web interface.
* The pagekite backend command line option also needs to transmit a
secret key which is used to authenticate with the frontend - there are
probably better options than this
** This usage of Pagekite may introduce other vulnerabilities
(outside Tahoe's security claims), for example a frontend could leak
your IP address to strangers (there IS a Tor mode, though)
*** Least Authority here only applies when Pagekite transfers HTTPS or
SSL traffic. Pagekite can also be used to transmit plaintext HTTP
traffic, but then it's vulnerable to snooping
[1] http://pagekite.net/wiki/OpenSource/
[2] http://foolscap.lothar.com/docs/api/foolscap.negotiate.Negotiation-class.html
[3] http://foolscap.lothar.com/docs/using-foolscap.html#auto7
[4] http://imgur.com/SqFOG
--
Andrew Miller