source: trunk/docs/key-value-store.rst

Last change on this file was 91047bf, checked in by Brian Warner <warner@…>, at 2016-12-12T21:57:28Z

.. -*- coding: utf-8-with-signature-unix; fill-column: 77 -*-

********************************
Using Tahoe as a key-value store
********************************

There are several ways you could use Tahoe-LAFS as a key-value store.

Looking only at things that are *already implemented*, there are three
options:

1. Immutable files

   API:

    * key ← put(value)

      This is spelled "`PUT /uri`_" in the API.

      Note: the user (client code) of this API does not get to choose the key!
      The key is determined programmatically using secure hash functions and
      encryption of the value and of the optional "added convergence secret".

    * value ← get(key)

      This is spelled "`GET /uri/$FILECAP`_" in the API. "$FILECAP" is the
      key.

   For details, see "immutable files" in :doc:`performance`, but in summary:
   the performance is not great but not bad.

   That document doesn't mention that if the size of the immutable file is
   less than or equal to `55 bytes`_ then the performance cost is much
   smaller, because the value gets packed into the key. This omission is
   tracked in ticket `#2226`_.

2. Mutable files

   API:

    * key ← create()

      This is spelled "`PUT /uri?format=mdmf`_".

      Note: again, the key cannot be chosen by the user! The key is
      determined programmatically using secure hash functions and RSA public
      key pair generation.

    * set(key, value)

    * value ← get(key)

      This is spelled "`GET /uri/$FILECAP`_". Again, the "$FILECAP" is the
      key. This is the same API as for getting the value from an immutable
      file, above. Whether the value you get this way is immutable (i.e. it
      will always be the same value) or mutable (i.e. an authorized person
      can change what value you get when you read) depends on the type of
      the key.

   Again, for details, see "mutable files" in :doc:`performance` (and
   `these tickets`_ about how that doc is incomplete), but in summary, the
   performance of the create() operation is *terrible*! (It involves
   generating a 2048-bit RSA key pair.) The performance of the set and get
   operations is probably not great but not bad.

3. Directories

   API:

    * directory ← create()

      This is spelled "`POST /uri?t=mkdir`_".

      :doc:`performance` does not mention directories (`#2228`_), but in order
      to understand the performance of directories you have to understand how
      they are implemented. Mkdir creates a new mutable file, in exactly the
      same way and with exactly the same performance as the mutable-file
      "create()" above.

    * set(directory, key, value)

      This is spelled "`PUT /uri/$DIRCAP/[SUBDIRS../]FILENAME`_". "$DIRCAP"
      is the directory, "FILENAME" is the key. The value is the body of the
      HTTP PUT request. The part about "[SUBDIRS../]" in there is for
      optional nesting which you can ignore for the purposes of this
      key-value store.

      This way, you *do* get to choose the key to be whatever you want (an
      arbitrary Unicode string).

      To understand the performance of ``PUT /uri/$directory/$key``,
      understand that this proceeds in two steps. First, it uploads the value
      as an immutable file, exactly the same as the "put(value)" call of the
      immutable-file API above, so right there you have already paid exactly
      the same cost as if you had used that API. Second, once that upload has
      finished and it has the resulting immutable file cap in hand, it
      downloads the entire current directory, changes it to include the
      mapping from the key to the immutable file cap, and re-uploads the
      entire directory. The cost of the second step is easy to understand:
      you have to download and re-upload the entire directory, which is the
      entire set of mappings from user-chosen keys (Unicode strings) to
      immutable file caps. Each entry in the directory occupies something on
      the order of 300 bytes.

      So the "set()" call of this directory-based API performs much worse
      than the equivalent "put()" call of the immutable-file-based API or
      the "set()" call of the mutable-file-based API. It is not necessarily
      worse overall than the mutable-file-based API, though, once you take
      into account the cost of the create() call that each mutable file
      requires.

    * value ← get(directory, key)

      This is spelled "`GET /uri/$DIRCAP/[SUBDIRS../]FILENAME`_". As above,
      "$DIRCAP" is the directory, "FILENAME" is the key.

      The performance of this is determined by the fact that it first
      downloads the entire directory, then finds the immutable filecap for
      the given key, then does a GET on that immutable filecap. So again,
      it is strictly worse than using the immutable file API (about twice
      as bad, if the directory size is similar to the value size).

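The three options above can be sketched as one small Python client. This is
an illustration under stated assumptions, not part of Tahoe-LAFS: the class
``TahoeKV``, its method names and the ``send`` hook are hypothetical, the
gateway address is assumed to be a local node on port 3456, the write
operations are assumed to return the new cap as the response body, and
``set_mutable`` assumes that ``PUT /uri/$FILECAP`` overwrites a mutable file
(this document does not spell that call out).

```python
import urllib.parse
import urllib.request

# Assumption: a local web gateway at this address; adjust for your node.
GATEWAY = "http://127.0.0.1:3456"


def http_send(method, url, body=None):
    """Send one HTTP request and return the response body as bytes."""
    request = urllib.request.Request(url, data=body, method=method)
    with urllib.request.urlopen(request) as response:
        return response.read()


class TahoeKV:
    """Hypothetical wrapper mapping the key-value calls onto web API URLs."""

    def __init__(self, gateway=GATEWAY, send=http_send):
        self.gateway = gateway.rstrip("/")
        self.send = send  # injectable, so the sketch can be exercised offline

    def _cap_url(self, cap, name=None):
        # Caps and filenames are percent-encoded before going into the path.
        url = self.gateway + "/uri/" + urllib.parse.quote(cap, safe="")
        if name is not None:
            url += "/" + urllib.parse.quote(name, safe="")
        return url

    def _cap_from(self, response_body):
        # Assumption: the response body is the new cap as ASCII text.
        return response_body.decode("ascii").strip()

    # 1. Immutable files: the key is chosen by Tahoe-LAFS, not by the caller.
    def put(self, value):
        return self._cap_from(self.send("PUT", self.gateway + "/uri", value))

    def get(self, key):
        # Same call for immutable and mutable keys.
        return self.send("GET", self._cap_url(key))

    # 2. Mutable files: create() is the expensive call (RSA key generation).
    def create_mutable(self, initial=b""):
        url = self.gateway + "/uri?format=mdmf"
        return self._cap_from(self.send("PUT", url, initial))

    def set_mutable(self, key, value):
        # Assumption: PUT /uri/$FILECAP replaces the mutable file's contents.
        self.send("PUT", self._cap_url(key), value)

    # 3. Directories: the caller chooses the key (the FILENAME).
    def create_directory(self):
        url = self.gateway + "/uri?t=mkdir"
        return self._cap_from(self.send("POST", url, b""))

    def dir_set(self, dircap, key, value):
        self.send("PUT", self._cap_url(dircap, key), value)

    def dir_get(self, dircap, key):
        return self.send("GET", self._cap_url(dircap, key))
```

With a running node, ``TahoeKV().put(b"hello")`` would return a filecap that
``get()`` accepts. Passing a different ``send`` callable lets the URL
construction be checked without any network access.
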
What about ways to use LAFS as a key-value store that are not yet
implemented? Well, Zooko has lots of ideas about ways to extend Tahoe-LAFS to
support different kinds of storage APIs or better performance. One that he
thinks is pretty promising is just the Keep It Simple, Stupid idea of "store a
sqlite db in a Tahoe-LAFS mutable". ☺

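As a rough illustration of that last idea, the sketch below keeps a whole
SQLite database as the single value of one mutable file. Nothing here exists
in Tahoe-LAFS: ``SqliteOnMutable``, ``db_to_bytes`` and ``bytes_to_db`` are
hypothetical names, and ``read_value`` / ``write_value`` stand in for the
mutable-file ``get(key)`` and ``set(key, value)`` calls described above. It
also ignores concurrent writers entirely.

```python
import os
import sqlite3
import tempfile


def db_to_bytes(conn):
    """Snapshot a SQLite database into a temp file and return its bytes."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    try:
        target = sqlite3.connect(path)
        try:
            conn.backup(target)  # consistent copy of the whole database
        finally:
            target.close()
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


def bytes_to_db(blob):
    """Load database bytes into a fresh in-memory connection."""
    conn = sqlite3.connect(":memory:")
    if not blob:
        return conn  # nothing stored yet: start from an empty database
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(blob)
        source = sqlite3.connect(path)
        try:
            source.backup(conn)
        finally:
            source.close()
    finally:
        os.remove(path)
    return conn


class SqliteOnMutable:
    """Hypothetical key-value store backed by one mutable file."""

    def __init__(self, read_value, write_value):
        self.read_value = read_value    # () -> bytes, stands in for get(key)
        self.write_value = write_value  # (bytes) -> None, stands in for set(key, value)

    def _open(self):
        conn = bytes_to_db(self.read_value())
        conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)")
        return conn

    def set(self, key, value):
        conn = self._open()
        try:
            with conn:  # commit before taking the snapshot
                conn.execute(
                    "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
                )
            self.write_value(db_to_bytes(conn))
        finally:
            conn.close()

    def get(self, key):
        conn = self._open()
        try:
            row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
            return None if row is None else row[0]
        finally:
            conn.close()
```

In this sketch every ``set()`` and ``get()`` still transfers the whole
database file, so it shows the shape of the idea rather than the performance
it might eventually have.
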
.. _PUT /uri: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#writing-uploading-a-file

.. _GET /uri/$FILECAP: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#viewing-downloading-a-file

.. _55 bytes: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/src/allmydata/immutable/upload.py?rev=196bd583b6c4959c60d3f73cdcefc9edda6a38ae#L1504

.. _PUT /uri?format=mdmf: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#writing-uploading-a-file

.. _#2226: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2226

.. _these tickets: https://tahoe-lafs.org/trac/tahoe-lafs/query?status=assigned&status=new&status=reopened&keywords=~doc&description=~performance.rst&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone&order=priority

.. _POST /uri?t=mkdir: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#creating-a-new-directory

.. _#2228: https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2228

.. _PUT /uri/$DIRCAP/[SUBDIRS../]FILENAME: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#creating-a-new-directory

.. _GET /uri/$DIRCAP/[SUBDIRS../]FILENAME: https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/frontends/webapi.rst#reading-a-file