[tahoe-dev] question about sharing...

Sun Jun 5 13:16:02 PDT 2011

On 6/4/11 6:59 AM, Greg Troxel wrote:
> 
> toby cabot <toby at caboteria.org> writes:
> 
>> well-known "root" and allow users to explore the contents of the
>> filesystem from there. Because the root is well-known, each user
>> would be able to do anything they wanted unless there were some sort
>> of inline permission check, so these filesystems implement "ACL"
>> permission checks to prevent users from doing things they can
>> discover, but are not permitted to do. In other words, I can discover
>> a directory's existence, but I might not be allowed to read from it.

Yeah, in other words, you can point to things that you aren't allowed to
touch. The capability model (which Tahoe follows) says "don't separate
designation from authority", which means that if you can point at
something, you can use it (and vice versa).

There are two main arguments in favor of the capability model. The first
is that it's easy to understand: it might not do what you want it to do,
but it's very clear (one might say "rigid") about what can and cannot be
done. By contrast, the unix ACL scheme takes more work and requires more
information to predict what will happen. You need to understand
directories and readdir() behavior to find names for things (as well as
symlinks and hardlinks, and the ../ behavior, and getcwd(), etc). Then,
you need to understand user accounts, group membership, file
owner/group, file modes, setuid/seteuid/setgid, sticky bits (and the
user's umask) to find out whether a given operation is likely to succeed
or not (and what file attributes will result). Sharing files between
just two users tends to require root intervention.

The second is the Confused Deputy Attack[1], which happens when someone
gives you a designation for an object (e.g. a filename) which they can't
touch, but which you can, and it's too hard for you to determine whether
they had the right to touch that object or not. This attack can't happen
in a capability-based system.
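
As a rough illustration of the difference (a toy sketch, not Tahoe code: the
names `FILES`, `ACL`, `acl_compile`, `WriteCap` and `cap_compile` are all
made up for this example), here is a "compiler" deputy that writes its
output wherever the caller says. In the ACL version the caller passes a
name and the deputy's own authority is used; in the capability version the
caller has to hand over a write capability they actually hold.

```python
class AccessDenied(Exception):
    pass


# --- ACL style: a name designates, a separate table authorizes ---
FILES = {"billing.log": "", "out.txt": ""}
ACL = {"billing.log": {"compiler"}, "out.txt": {"compiler", "alice"}}


def acl_write(principal, name, data):
    # Permission check is keyed on who is asking, not on how they got the name.
    if principal not in ACL[name]:
        raise AccessDenied(name)
    FILES[name] += data


def acl_compile(output_name):
    # The deputy acts with its own identity, so it cannot tell whether the
    # caller was entitled to touch output_name.
    acl_write("compiler", output_name, "debug output\n")


# --- Capability style: the reference itself carries the authority ---
class WriteCap:
    def __init__(self, store, name):
        self._store = store
        self._name = name

    def write(self, data):
        self._store[self._name] += data


def cap_compile(output_cap):
    # The deputy can only write to what the caller was able to hand over.
    output_cap.write("debug output\n")
```

With the ACL version, alice can call `acl_compile("billing.log")` and the
write succeeds even though alice is not on that file's ACL. With the
capability version, alice only holds a `WriteCap` for `out.txt`, so that is
the only thing she can ask the deputy to write to. (Python won't stop anyone
from poking at `_store`; a real capability system relies on unguessable or
unforgeable references.)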

>> I understand that trust in a traditional FS can be actively
>> subverted in the same way that it could in Tahoe-LAFS, but what about
>> mistakes? Let's say, for example, that I want to respond to an email
>> and my reply has a capability URI in it, but I hit "reply all" by
>> mistake, or my crappy email client auto-fills "Jones, Fred" instead
>> of "Jones, Frank" and I don't realize before I send it.
> 
> That's a hugely important point. It has parallels in traditional
> filesystems - what if someone types their password or other access
> control information in the wrong window? Or if you put something
> sensitive in a reply and it doesn't get encrypted because you forgot
> to check encrypt. But URI sharing is a bit scary - I haven't really
> done that since I'm looking at tahoe as a backup mechanism more than a
> sharing mechanism.

Yeah. The capability-oriented nature of Tahoe's file-access system is
very "raw", very all-or-nothing. I feel this is the right thing to
build, though, because you can build systems on top of it that give you
the revocation properties you want. In this case, you want a revocable
forwarder: some node somewhere which holds the real readcap, and you
share a facet of that forwarder with your buddy instead of sharing the
actual readcap. The forwarder will give them the readcap when asked
(unless it's been revoked), but will also remember that it was given
out. You can ask the forwarder to revoke access, and it will tell you
whether someone's already fetched the readcap or not (i.e. whether
you're too late). You could also rig this to prevent the forwarder from
actually learning the readcap, by giving it an encrypted copy (and the
token you give to your buddy includes the decryption key).
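
A minimal in-memory sketch of that forwarder idea might look like the
following. Nothing like this exists in Tahoe today; the class name
`RevocableForwarder`, its method names, and the readcap string in the usage
note are placeholders invented for illustration, and the encrypted-copy
variant is left out.

```python
import secrets


class RevocableForwarder:
    """Holds real readcaps and hands them out via revocable tokens."""

    def __init__(self):
        # token -> {"readcap": str, "revoked": bool, "fetched": bool}
        self._entries = {}

    def share(self, readcap):
        # Create an unguessable token (the "facet") to give to a buddy.
        token = secrets.token_urlsafe(16)
        self._entries[token] = {
            "readcap": readcap,
            "revoked": False,
            "fetched": False,
        }
        return token

    def fetch(self, token):
        # Return the readcap unless the token is unknown or revoked,
        # and remember that it was given out.
        entry = self._entries.get(token)
        if entry is None or entry["revoked"]:
            return None
        entry["fetched"] = True
        return entry["readcap"]

    def revoke(self, token):
        # Revoke access. Returns True if the readcap had already been
        # fetched (i.e. the revocation came too late).
        entry = self._entries[token]
        entry["revoked"] = True
        return entry["fetched"]
```

Usage would be something like `token = fw.share("URI:CHK:placeholder")`,
mail the token instead of the readcap, and call `fw.revoke(token)` when you
notice the mistake. A `False` return means nobody had fetched the readcap
before you revoked it.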

I think it's fair to argue that a revocable tool like this ought to be
the primary user-facing sharing mechanism, and the raw readcap should be
hidden from casual users. (The biggest downsides are complexity and the
availability hit caused by a forwarder being offline.) But we haven't
gotten it done yet. We started with the raw mechanism because you can
build the revocable one on top of the raw one, but not the other way
around. And there are different models you could build on top of the raw
capabilities, whereas if we'd started with a
traditional-in-the-unix-world user-account-based ACL system, those other
models would be impossible.

>> From what I've heard, with the traditional FS I can close the barn
>> door and know that it's closed. With Tahoe-LAFS I can only ask for
>> the barn door to be closed at some point in the future.
> 
> That's true for immutable files. For a directory, you can unlink the
> contents and push out a new version. But I don't think there's been
> any analysis of how effective this is.

Yeah, currently the easiest form of this "revocable forwarder" is to use
a mutable directory: you can't find out when it's been read, but you can
delete the reference to prevent subsequent reads from working.
Directories (like all tahoe objects) are made to be reliable, though,
which means they're spread out over multiple shares. To securely delete
something, you need to remove it from all the shares, which only happens
if all the servers are online when you do the delete. If some of them
were offline, and thus still have old versions of the shares, your
erstwhile sharing partner may still be able to reconstruct the filecap
from those old shares.
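
To make the stale-share problem concrete, here is a toy model (not Tahoe's
actual storage protocol; `Server`, `publish`, `stale_servers` and
`old_version_recoverable` are hypothetical names). Each server keeps one
versioned share, an update only reaches the servers that are online, and
`k` is an assumed number of old shares a reader would need. Setting `k=1`
is the pessimistic reading of "remove it from all the shares".

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.online = True
        self.version = 0      # 0 means "no share stored yet"
        self.contents = None


def publish(servers, version, contents):
    """Write a new version to every online server.

    Returns the names of servers that were offline and missed the update.
    """
    missed = []
    for server in servers:
        if server.online:
            server.version = version
            server.contents = contents
        else:
            missed.append(server.name)
    return missed


def stale_servers(servers, current_version):
    """Names of servers still holding a share older than current_version."""
    return [
        s.name
        for s in servers
        if s.contents is not None and s.version < current_version
    ]


def old_version_recoverable(servers, current_version, k):
    """True if at least k stale shares survive (k is an assumed threshold)."""
    return len(stale_servers(servers, current_version)) >= k
```

If two of ten servers are offline when you push the directory version
without the link, `publish` reports both as missed, and with `k=1` the old
directory contents count as still recoverable until those servers come back
and get updated.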

This is an unsolvable problem: there's a tradeoff between availability
(before you revoke) and unavailability (after you revoke). If you only
have one node holding the forwarder, then you can be pretty confident
about your ability to revoke access, but when you *do* want someone to
see the file, they're depending upon a single point of failure. If you
spread the forwarder over ten servers (as mutable directories are), then
you're highly likely to have access when you want it, but revocation
depends upon being able to reach all ten servers.
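
A back-of-the-envelope way to see the tradeoff, under assumptions that are
mine rather than anything measured: each server is independently online
with probability `p`, a reader needs any `k` of the `n` servers, and
revocation needs all `n`. The function names and the example numbers are
illustrative only.

```python
from math import comb


def read_availability(p, n, k):
    """Probability that at least k of n independent servers are online."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


def revocation_reach(p, n):
    """Probability that all n independent servers are online at once."""
    return p**n


if __name__ == "__main__":
    p = 0.9  # assumed per-server uptime
    for n, k in [(1, 1), (10, 3)]:
        print(
            "n=%d k=%d read=%.6f revoke=%.6f"
            % (n, k, read_availability(p, n, k), revocation_reach(p, n))
        )
```

With one server the two numbers are the same (both equal `p`). As `n` grows
with `k` held small, read availability climbs toward 1 while the chance of
reaching every server for a revocation drops as `p**n`.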

cheers,
 -Brian

[1]: http://en.wikipedia.org/wiki/Confused_Deputy