[tahoe-lafs-trac-stream] [tahoe-lafs] #959: tahoe-lafs objects
tahoe-lafs
trac at tahoe-lafs.org
Fri Jul 5 20:06:53 UTC 2013
#959: tahoe-lafs objects
-------------------------+-------------------------------------------------
Reporter: warner         | Owner: nobody
Type: enhancement        | Status: new
Priority: major          | Milestone: 2.0.0
Component: unknown       | Version: 1.6.0
Resolution:              | Keywords: objects validation
Launchpad Bug:           |   backward-compatibility forward-compatibility
                         |   revocation
-------------------------+-------------------------------------------------
Comment (by nejucomo):
**Summary:**
1. Nejucomo brainstorms the differences between a native "C-List + blob"
feature and "emulating" that feature with existing lafs directories and
files,
1. realizes that emulation is good enough,
1. abandons support for any new "C-List + blob" feature,
1. asks whether that emulation is also good enough for live objects, and
1. finally decides to post the whole brainstorm in case posterity finds
anything useful in it. ;-)
Replying to [comment:13 zooko]:
> Replying to [comment:12 nejucomo]:
> >
> > With only the first two of these bullets, the storage model becomes
> > "arbitrary DAGs (Directed Acyclic Graphs)" instead of only "files or
> > directories".
>
> I don't understand the distinction. Isn't the current LAFS
> files-and-directories structure already an arbitrary DAG? Just use
> incrementing integers as your filenames in your directories, and then
> that's a C-list. Or am I missing something?
Yes, that is already possible: an existing lafs directory can serve as the
C-List, a separate lafs file can serve as the blob, and a third container
directory can bind the two, at the cost of efficiency.
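The three-node layout described above can be modeled with a toy in-memory
sketch. This is not the Tahoe-LAFS API: a "directory" is just a Python dict
and a "file" is a bytes object, and the child names "clist" and "blob" are
hypothetical names chosen for illustration.

```python
# Toy in-memory model, NOT the Tahoe-LAFS API: a "directory" is a dict
# mapping child names to nodes, and a "file" is a bytes object.
# The child names "clist" and "blob" are assumptions made for this sketch.

def make_clist(caps):
    """Emulate a C-List as a directory whose child names are "0", "1", ..."""
    return {str(i): cap for i, cap in enumerate(caps)}

def make_object(caps, blob):
    """Bind a C-List directory and a blob file under one container directory."""
    return {"clist": make_clist(caps), "blob": blob}

def get_cap(obj, index):
    """Look up the node stored in a given C-List slot."""
    return obj["clist"][str(index)]

leaf_a = make_object([], b"leaf A")
leaf_b = make_object([], b"leaf B")
# Slots 0 and 2 point at the same node, so the structure is a DAG, not a tree.
root = make_object([leaf_a, leaf_b, leaf_a], b"root refers to slots 0, 1, 2")
```

Each object costs three nodes here (container, C-List directory, blob file),
which is the efficiency cost mentioned above.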
I'll call that "the emulation design" below (because it is emulating
"C-List + blob" as a built-in feature). Let's contrast emulation with a
built-in feature:
**Emulation:**
* Requires three file nodes.
* Requires the application to encode its data structure into lafs
directory names or other edge metadata.
* Is there a simple, safe way to encode arbitrary bytes into an edge
name?
* Is it possible to attach arbitrary metadata to edges?
* Cannot guarantee consistency between the C-List-emulation and blob-
emulation:
* A reader must read these separately, thus has no guarantee the two
reads return a consistent view.
* Fits the current "files and directories" abstraction well, so the user
interfaces map reasonably well.
* For example, fuse interfaces know what to do.
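On the question of encoding arbitrary bytes into an edge name, one possible
answer is sketched below using unpadded, lowercase base32 from the Python
standard library. It assumes only that a child name may be any ASCII string
of letters and digits; the function names are hypothetical, and nothing here
is taken from the Tahoe-LAFS code base.

```python
import base64

def bytes_to_edge_name(data: bytes) -> str:
    """Encode arbitrary bytes as a name using only a-z and 2-7."""
    return base64.b32encode(data).decode("ascii").rstrip("=").lower()

def edge_name_to_bytes(name: str) -> bytes:
    """Invert bytes_to_edge_name by restoring case and '=' padding."""
    padded = name.upper() + "=" * (-len(name) % 8)
    return base64.b32decode(padded)

# Note: empty input encodes to an empty name, which a real directory
# would probably reject; an application would need to handle that case.
```

The encoded name avoids "/" and other characters that a file-system style
interface might treat specially, at the price of names that are about 60%
longer than the raw bytes.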
By contrast:
**Native C-List+Blob Feature:**
* Requires a single node for any application.
* Allows the application to generate/interpret the blob "directly".
* Ensures mutation consistency between the C-List and blob in the same
manner as single file-node mutable consistency is currently ensured.
* Can represent directories as a particular application.
* Breaks the notion of "just file and directories", introducing complexity
into the user interfaces.
* For example, what does a fuse interface do with an arbitrary C-List +
blob?
* It could do the "inverse" of the emulation above: present a
directory with a C-List subdirectory and a blob file. Now we have the
same encoding problems as with emulation, except that instead of the
application which wrote the blob inventing the encoding, a completely
unaware general-purpose application (i.e. the fuse interface) has to pick
one.
After brainstorming those differences, it seems to me that the primary
advantage of the native feature is its mutation consistency properties.
It occurs to me that if it's already possible to encode arbitrary metadata
in the node edges, then we already have C-List + blob, where the blob is
split up across the edge metadata.
Oh... I just realized: If the emulation container directory is mutable,
but the C-List-emulation child and the blob child are immutable, this
solves the consistency problem, right? A reader of the container
directory knows that the immutable C-List cap and the immutable blob-file
cap were written together by a single writer, and therefore their
inter-relationships are as consistent as the writer made them.
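The consistency argument can be illustrated with a toy store. This is a
sketch under stated assumptions, not the Tahoe-LAFS API: immutable content
is addressed by its SHA-256 hash, a mutable container is a single slot that
is replaced in one assignment, and the names `Store`, `write_object`, and
`read_object` are all hypothetical.

```python
import hashlib

class Store:
    """Toy store: immutable data is keyed by its hash; mutable slots are
    replaced in one assignment (assumed to be atomic)."""
    def __init__(self):
        self.immutable = {}
        self.mutable = {}

    def put_immutable(self, data: bytes) -> str:
        cap = hashlib.sha256(data).hexdigest()
        self.immutable[cap] = data
        return cap

    def get_immutable(self, cap: str) -> bytes:
        return self.immutable[cap]

def write_object(store, slot, clist, blob):
    """Write both children as immutables, then update the container once."""
    clist_cap = store.put_immutable("\n".join(clist).encode("utf-8"))
    blob_cap = store.put_immutable(blob)
    store.mutable[slot] = (clist_cap, blob_cap)

def read_object(store, slot):
    """Read the container once; the two caps it holds were written together."""
    clist_cap, blob_cap = store.mutable[slot]
    raw = store.get_immutable(clist_cap)
    clist = raw.decode("utf-8").split("\n") if raw else []
    return clist, store.get_immutable(blob_cap)

store = Store()
write_object(store, "obj", ["cap-a"], b"uses slot 0")
snapshot = store.mutable["obj"]  # a reader fetches the container once
# A writer now replaces the object before the reader fetches the children.
write_object(store, "obj", ["cap-a", "cap-b"], b"uses slots 0 and 1")
old_clist_cap, old_blob_cap = snapshot
```

The reader holding `snapshot` still resolves a matching C-List and blob from
the earlier write, because the children it names can never change; a fresh
`read_object` call sees the newer pair instead.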
Ok, now I'm pretty much satisfied that "C-List + blob" is unnecessary, at
least for arbitrary DAGs.
Is this also true of the "live objects" proposal? If we require that the
interpretation of a "C-List + blob + object-code" structure starts with a
cap to a directory where C-List, blob, and object-code are separate but
*immutable* files, is this a sufficient building block for live objects?
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/959#comment:15>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage