[tahoe-dev] several newbie questions

Thu Apr 21 07:52:28 PDT 2011

Hi Zooko,

Thanks for the detailed reply!

Zooko O'Whielacronx wrote:
> On Thu, Apr 21, 2011 at 6:58 AM, Miles Fidelman
> <mfidelman at meetinghouse.net>  wrote:
>    
>> 2. Looking at the capability mechanisms, it's not clear to what extent
>> capabilities are bound to individuals.  The standard problem with key-based
>> capabilities mechanisms is that they can be copied.
>>      
> On the other hand, various techniques which go under the rubric of
> "DRM" can incompletely prevent people from doing that, and as far as
> I've thought about it, Tahoe-LAFS does not prevent you from using
> those techniques to deter people from sharing access to data.
>    

One of the long-standing arguments against capability-based security has
been transitivity (if I give you a capability, you can just pass it
along), as opposed to centralized access-control servers (e.g., Kerberos),
where access is checked against the requester's identity.

In response, there's been a lot of work that's focused on 
cryptographically binding capabilities to individuals - e.g. by using an 
individual's public key to generate a capability that's unique to that 
individual.  There's a really good summary, with literature links, at 
http://www.erights.org/elib/capability/ode/overview.html.
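
As a minimal sketch of the binding idea (an illustration only: this is not how Tahoe-LAFS works and is not taken from the summary linked above), an issuer could compute a keyed MAC over the resource name together with a hash of the holder's public key. The token then verifies only when presented alongside that same public key. All names here (`mint_bound_capability`, `verify_bound_capability`, the byte-string "public keys") are hypothetical. A real scheme would also make the presenter prove possession of the matching private key, e.g. with a signed challenge, which is omitted here.

```python
import hashlib
import hmac


def fingerprint(public_key: bytes) -> bytes:
    """Hash the holder's public key so the token binds to a fixed-size value."""
    return hashlib.sha256(public_key).digest()


def mint_bound_capability(issuer_secret: bytes, resource_id: bytes,
                          holder_public_key: bytes) -> bytes:
    """Issue a token tied to one resource and one holder (hypothetical scheme)."""
    # Length-prefix the resource id so (resource, key) pairs cannot be confused.
    message = (len(resource_id).to_bytes(4, "big") + resource_id
               + fingerprint(holder_public_key))
    return hmac.new(issuer_secret, message, hashlib.sha256).digest()


def verify_bound_capability(issuer_secret: bytes, resource_id: bytes,
                            presenter_public_key: bytes, token: bytes) -> bool:
    """Recompute the token for the presenter's key and compare in constant time."""
    expected = mint_bound_capability(issuer_secret, resource_id,
                                     presenter_public_key)
    return hmac.compare_digest(expected, token)
```

A token minted for one holder's key fails verification when someone else presents it with a different key, so copying the token alone does not transfer access. The cost is that the verifier needs the issuer's secret, which brings back some of the centralization that pure capabilities avoid.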

> This is one of those topics where generalized statements about what is
> good and bad generate more heat than light, but concrete technical
> details can help everyone understand better.
>
> So, if you want to pursue this topic, please share with us what your
> goals are and we can discuss details about how such goals could be
> met.
>    

Well... since you asked... I'm basically looking for the holy grail:
a massively distributed, multi-access, P2P, secure file store :-)

More seriously, I've been looking for a file system to underlie some
technology development for very dispersed P2P collaboration. A
general-purpose, massively distributed, disconnection-tolerant filesystem
is really a key component. I've been exploring several different avenues:

- traditional cluster file systems - but I've yet to see one that works 
across the wide area (Ceph looks interesting but is still way early in 
its development)

- NoSQL databases (particularly CouchDB for documents, Riak for
key-value stores) - these are pretty viable and production ready at this
point (they rely on massive replication, a reasonably well-understood
approach dating back to NNTP, but one that isn't all that secure and
wastes a lot of storage space)

- some of the DHT-based approaches look really interesting - but none of
them seem to have really developed into mature systems (e.g.,
WheelFS as the most recent incarnation)

- P2P networks (e.g., Gnutella)

- distributed version control systems (notably Git and Darcs)

- and then there's the line of development based on dispersed storage
and erasure coding, dating back to OceanStore, with Cleversafe and
Tahoe-LAFS as the latest incarnations. I keep getting drawn back to
Tahoe, since the TiddlyWiki in Tahoe implementation is along the same
lines as what we're pursuing. That leads to two fairly strict
requirements: consistency control / conflict resolution mechanisms, and
fairly granular access control. Hence my inquiries.
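
To make the storage trade-off between replication and erasure coding concrete, here is a minimal sketch of the simplest possible erasure code: split the data into two halves and add one XOR parity share, so that any two of the three shares recover the data. This is an illustration only. Tahoe-LAFS uses a more general k-of-N code, and the function names `encode` and `decode` are hypothetical.

```python
from typing import List, Optional


def _xor(left: bytes, right: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(left, right))


def encode(data: bytes) -> List[bytes]:
    """Split data into two halves plus an XOR parity share (2-of-3 code)."""
    half = (len(data) + 1) // 2
    first = data[:half]
    # Zero-pad the second half so both halves have the same length.
    second = data[half:].ljust(half, b"\x00")
    return [first, second, _xor(first, second)]


def decode(shares: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data from any two shares; a missing share is passed as None."""
    if sum(share is None for share in shares) > 1:
        raise ValueError("need at least two of the three shares")
    first, second, parity = shares
    if first is None:
        first = _xor(second, parity)
    elif second is None:
        second = _xor(first, parity)
    # Drop the padding that encode() may have added.
    return (first + second)[:length]
```

The three shares together take about 1.5 times the size of the original data and survive the loss of any one share. Surviving one loss with plain replication takes two full copies, i.e. 2 times the original size, and the gap widens as more losses must be tolerated.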

I expect we'll end up going with either a NoSQL, Gnutella, or DVCS
approach (or a hybrid thereof) - but I keep hoping to find a more
general underlying platform - and Tahoe is the closest I've found.

Thanks,

Miles Fidelman

-- 
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra