[tahoe-dev] how to access files: FUSE, CLI, Dropbox-like-hack, etc. (was: Potential use for personal backup)

Zooko Wilcox-O'Hearn zooko at zooko.com
Tue May 22 19:45:55 UTC 2012


On Tue, May 22, 2012 at 1:17 PM, Greg Troxel <gdt at ir.bbn.com> wrote:
>
> I find this a bit bizarre (but I'm one of those unix-heads you talk about below).

You're one of my favorite unix-heads. Thank you for consistently
contributing your ideas to this project.

> But I think the reason I find it odd is that tahoe-lafs *is* a filesystem.   Now, if you mean: "it's a filesystem, but think about accessing a filesystem on another machine over scp; the current software interface that people use feels more like that than mounting a disk onto /mnt" then I see what you mean (and indeed it's true).

You're right. I don't *really* mean "Tahoe the Least-Authority File
System is not a File System". What I really mean is "Tahoe-LAFS is not
best used through the traditional POSIX semantics, and therefore it is
not best used through your operating system's Virtual File System
layer".

> tahoe certainly has slightly unusual semantics compared to POSIX (necessary because of how it works; that's not meant to be a complaint), but in many ways it's not so far off.  On top of that there is a culture of access via a command-line program or web gateway rather than OS filesystem integration (as is normal for pretty much every other filesystem), but I think that's both a current cultural artifact and a reflection that the fuse support isn't complete/etc.

Hm, well I think the FUSE support is already nearly as good as it can
be, and that's not very good.

I think the semantic mismatches -- Tahoe-LAFS immutables vs. POSIX
everything-is-mutable for starters -- mean that the FUSE layer, no
matter how well-engineered and complete, can't provide the full
functionality and efficiency that the Tahoe-LAFS layer provides.
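
A toy sketch of that mismatch, not Tahoe-LAFS code: the ImmutableStore
class, the posix_style_write helper, and the use of a SHA-256 digest as a
"cap" are all assumptions made for illustration (real Tahoe-LAFS caps and
storage work differently). It shows what a POSIX-style in-place write
costs when the only primitive underneath is "store a whole new immutable
file":

```python
import hashlib


class ImmutableStore:
    """Toy content-addressed store standing in for immutable files."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Hypothetical "cap": just a SHA-256 hex digest of the content.
        cap = hashlib.sha256(data).hexdigest()
        self._blobs[cap] = bytes(data)
        return cap

    def get(self, cap: str) -> bytes:
        return self._blobs[cap]


def posix_style_write(store, cap, offset, patch):
    """Emulate an in-place write on an immutable file.

    The whole file is fetched, patched in memory, and stored again
    under a new cap. Returns (new_cap, bytes_moved).
    """
    original = store.get(cap)
    buf = bytearray(original)
    end = offset + len(patch)
    if end > len(buf):
        # Writing past the end zero-fills the gap, as POSIX would.
        buf.extend(b"\0" * (end - len(buf)))
    buf[offset:end] = patch
    new_cap = store.put(bytes(buf))
    # Whole file read plus whole file written, for a patch of any size.
    bytes_moved = len(original) + len(buf)
    return new_cap, bytes_moved
```

Changing one byte moves the whole file twice and yields a different cap,
while the old cap still names the old content. A FUSE layer has to hide
all of that behind a call that looks like a cheap in-place write.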

Of course, you can always paper over any limitations of functionality
by adding caching! But that necessarily adds latency and, worst of
all, interesting new failure modes. I've always resisted the
suggestion to add caching into Tahoe-LAFS itself (#316) because I
don't want to expose users (and Tahoe-LAFS developers) to those added
failure modes and because I think caching (and prefetching) would be
done better by a separate layer.

For example, you could imagine a separate box on your network -- a
Network Attached Storage device or "NAS" -- which serves files to you
over tried and true protocols like, uh, SMB, NFS, Gluster, or whatever
the kids are using nowadays, and which also runs a Tahoe-LAFS client
to back up or sync those files with the remote grid. The NAS can be
seen from one perspective as nothing but a huge, very smart cache for
Tahoe-LAFS.

For another example, Dropbox can be seen as a great hack to use your
local disk and your operating system's built-in filesystem as a huge,
very smart cache for the Dropbox remote sync protocol.

So to reiterate:

1. I think the FUSE layer that we already have is pretty good, for a
FUSE layer. It is well-engineered and reliable, and there is nothing
it does a lot less efficiently than it could.

2. I think any possible FUSE layer would introduce inefficiencies and
hide functionalities (like immutable files, capability access control,
easy link-based sharing, ...).

3. I think a good trend is to stop trying to stretch the POSIX
semantics across the wide-area network, but rather to use POSIX
semantics from the app to a local filesystem which basically acts as a
highly intelligent cache, and then use newer and more
Internet-friendly semantics from there across the wide-area network.
I'll bet there is a lot of value to be created by following Dropbox's
lead and adding more of that kind of functionality to Tahoe-LAFS.
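
The arrangement in point 3 can be sketched roughly as follows. This is a
minimal illustration under stated assumptions, not an existing Tahoe-LAFS
feature: sync_directory, the state dictionary, and the upload callable are
hypothetical names. In practice upload would wrap whatever interface the
Tahoe-LAFS client offers (the CLI or the web gateway); here it is any
function that takes bytes and returns a cap string. Apps use ordinary
POSIX semantics on the local directory, and only whole changed files
cross the wide-area network:

```python
import hashlib
import os


def sync_directory(root, state, upload):
    """Upload files under `root` whose content changed since the last sync.

    `state` maps relative path -> (sha256 hex digest, cap) and is updated
    in place. `upload` is a hypothetical callable: bytes -> cap string.
    Returns the sorted list of relative paths that were uploaded.
    """
    uploaded = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as f:
                data = f.read()
            digest = hashlib.sha256(data).hexdigest()
            previous = state.get(rel)
            if previous is not None and previous[0] == digest:
                continue  # unchanged since last sync: nothing to send
            # Changed or new: store a fresh immutable copy remotely.
            state[rel] = (digest, upload(data))
            uploaded.append(rel)
    return sorted(uploaded)
```

Run periodically (or from a file-change notifier), this keeps the local
filesystem as the fast, fully POSIX "cache" while the remote side only
ever sees whole-file immutable uploads. Deletions, conflicts, and
directory structure on the grid are deliberately left out of the sketch.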

Regards,

Zooko

https://tahoe-lafs.org/trac/tahoe-lafs/ticket/316 # add caching to tahoe proper?

