[tahoe-dev] Grid ID ideas

Chris Goffinet cg at chrisgoffinet.com
Sat May 10 14:04:28 PDT 2008


Brian,

Have you thought about the situation where you want to have multiple
grids holding the same set of files, for the case of a datacenter going
down? The idea being that as a company you could set up the HA aspects
(DNS, load balancing, etc.) on the front end, and on the back end have
multiple grids connected together only for upload purposes. You
wouldn't want a client that connects to datacenter A to pull data from
B, because of latency. The front end would route each client to the
grid with the closest geolocation.

Would some of these ideas you present below help with that situation?


On May 10, 2008, at 1:49 PM, Brian Warner wrote:
> I had some crazy ideas about how to handle a "grid ID" last night.. I
> think this scheme might actually work. Here's my writeup. It's not
> particularly coherent, but I think the basic ideas are captured.
>
> -Brian
>
>
> = Grid Identifiers =
>
> What makes up a Tahoe "grid"? The rough answer is a fairly stable set
> of Storage Servers.
>
> The read- and write-caps that point to files and directories are
> scoped to a particular set of servers. The Tahoe peer-selection and
> erasure-coding algorithms provide high availability as long as there
> is significant overlap between the servers that were used for upload
> and the servers that are available for subsequent download. When new
> peers are added, the shares will get spread out in the search space,
> so clients must work harder to download their files. When peers are
> removed, shares are lost and file health is threatened. Repair
> bandwidth must be used to generate new shares, so cost increases with
> the rate of server departure. If servers leave the grid too quickly,
> repair may not be able to keep up, and files will be lost.
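>
> For intuition: with the default 3-of-10 encoding, a file stays
> recoverable as long as at least 3 of its 10 shares are on reachable
> servers. Here is a rough sketch of that survival probability, under
> the idealized assumption that each server independently stays
> reachable with probability p (this is not Tahoe's actual repair math):
>
>   from math import comb
>
>   def survival(p, k=3, n=10):
>       # Probability that at least k of n independently-held shares
>       # remain reachable.
>       return sum(comb(n, i) * p**i * (1 - p)**(n - i)
>                  for i in range(k, n + 1))
>
>   # survival(0.9) is essentially 1.0, survival(0.5) is about 0.945,
>   # and survival(0.2) is about 0.32: most files are already lost.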
>
> So to get long-term stability, we need that peer set to remain fairly
> stable. A peer which joins the grid needs to stick around for a while.
>
> == Multiple Grids ==
>
> The current Tahoe read-cap format doesn't admit the existence of
> multiple grids. In fact, the "URI:" prefix implies that these cap
> strings are universal: it suggests that this string (plus some
> protocol definition) is completely sufficient to recover the file.
>
> However, there are a variety of reasons why we may want to have more
> than one Tahoe grid in the world:
>
> * scaling: there are a variety of problems that are likely to be
>   encountered as we attempt to grow a Tahoe grid from a few dozen
>   servers to a few thousand, some of which are easier to deal with
>   than others. Maintaining connections to servers and keeping
>   up-to-date on the locations of servers is one issue. There are
>   design improvements that can work around these, but they will take
>   time, and we may not want to wait for that work to be done. Being
>   able to deploy multiple grids may be the best way to get a large
>   number of clients using Tahoe at once.
>
> * managing quality of storage, storage allocation: the members of a
>   friendnet may want to restrict access to storage space to just each
>   other, and may want to run their grid without involving any
>   external coordination
>
> * commercial goals: a company using Tahoe may want to restrict access
>   to storage space to just their customers
>
> * protocol upgrades, development: new and experimental versions of
>   the Tahoe software may need to be deployed and analyzed in
>   isolation from the grid that clients are using for active storage
>
> So if we define a grid to be a set of storage servers, then two
> distinct grids will have two distinct sets of storage servers.
> Clients are free to use whichever grid they like (and have permission
> to use); however, each time they upload a file, they must choose a
> specific grid to put it in. Clients can upload the same file to
> multiple grids in two separate upload operations.
>
> == Grid IDs in URIs ==
>
> Each URI needs to be scoped to a specific grid, to avoid confusion
> ("I looked for URI123 and it said File Not Found.. oh, which grid did
> you upload that into?"). To accomplish this, the URI will contain a
> "grid identifier" that references a specific Tahoe grid. The grid ID
> is shorthand for a relatively stable set of storage servers.
>
> To make the URIs actually Universal, there must be a way to get from
> the grid ID to the actual grid. This document defines a protocol by
> which a client that wants to download a file from a previously-unknown
> grid will be able to locate and connect to that grid.
>
> == Grid ID specification ==
>
> The grid ID is a string, using a fairly limited character set:
> alphanumerics plus possibly a few others. It can be very short: a
> grid ID of just "0" can be used. The grid ID will be copied into the
> cap string for every file that is uploaded to that grid, so there is
> pressure to keep them short.
>
> The cap format needs to be able to distinguish the grid ID from the
> rest of the cap. This could be expressed in DNS-style dot notation:
> for example, the directory write-cap with a write-key of "0ZrD.."
> that lives on grid ID "foo" could be expressed as
> "D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm.foo".
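>
> A minimal sketch of splitting such a cap, assuming the grid ID is
> simply everything after the last dot (a hypothetical parser, not the
> real cap-parsing code):
>
>   def parse_cap(cap):
>       body, _, grid_id = cap.rpartition(".")
>       return body, grid_id
>
>   body, grid_id = parse_cap("D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm.foo")
>   # body == "D0ZrDNAHuxs0XhYJNmkdicBUFxsgiHzMdm", grid_id == "foo"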
>
> * design goals: non-word-breaking, double-click-pasteable, maybe
>   human-readable (do humans need to know which grid is being used?
>   probably not).
> * does not need to be Secure (i.e. long and unguessable), but we must
>   analyze the sorts of DoS attack that can result if it is not (and
>   even if it is)
> * does not need to be human-memorable, although that may assist
>   debugging and discussion ("my file is on grid 4, where is yours?")
> * *does* need to be unique, but the total number of grids is fairly
>   small (counted in the hundreds or thousands rather than millions or
>   billions) and we can afford to coordinate the use of short names.
>   Folks who don't like coordination can pick a largeish random string.
>
> Each announcement that a Storage Server publishes (to introducers)
> will include its grid ID. If a server participates in multiple grids,
> it will make multiple announcements, each with a single grid ID.
> Clients will be able to ask an introducer for information about all
> storage servers that participate in a specific grid.
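>
> A hypothetical shape for those announcements, for a server that
> participates in two grids (the field names are illustrative, not the
> actual introducer wire format):
>
>   storage_furl = "pb://TUBID@HOST:PORT/SWISSNUM"  # placeholder FURL
>   announcements = [
>       {"service": "storage", "grid-id": "foo", "furl": storage_furl},
>       {"service": "storage", "grid-id": "bar", "furl": storage_furl},
>   ]
>   # A client asking about grid "foo" would be told only the first one.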
>
> Clients are likely to have a default grid ID, to which they upload
> files. If a client is adding a file to a directory that lives in a
> different grid, they may upload the file to that other grid instead
> of their default.
>
> == Getting from a Grid ID to a grid ==
>
> When a client decides to download a file, it starts by unpacking the
> cap and extracting the grid ID.
>
> Then it attempts to connect to at least one introducer for that grid,
> by leveraging DNS:
>
> hash $GRIDID (with some tag) to get a long base32-encoded string: $HASH
>
> GET http://tahoe-$HASH.com/introducer/gridid/$GRIDID
>
> the result should be a JSON-encoded list of introducer FURLs
>
> for extra redundancy, if that query fails, perform the following
> additional queries (a code sketch of the full lookup follows):
>
>  GET http://tahoe-$HASH.net/introducer/gridid/$GRIDID
>  GET http://tahoe-$HASH.org/introducer/gridid/$GRIDID
>  GET http://tahoe-$HASH.tv/introducer/gridid/$GRIDID
>  GET http://tahoe-$HASH.info/introducer/gridid/$GRIDID
>   etc
>  GET http://tahoe-grids.allmydata.com/introducer/gridid/$GRIDID
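>
> A sketch of the whole lookup, assuming SHA-256 with a made-up tag, a
> truncated lowercase base32 hash, and a JSON list in the response
> body; the tag, truncation length, and TLD list are all illustrative
> choices, not a fixed spec:
>
>   import base64, hashlib, json, urllib.request
>
>   def introducers_for(grid_id):
>       digest = hashlib.sha256(b"tahoe-grid-id-v1:"
>                               + grid_id.encode()).digest()
>       h = base64.b32encode(digest).decode().lower().rstrip("=")[:20]
>       hosts = ["tahoe-%s.%s" % (h, tld)
>                for tld in ("com", "net", "org", "tv", "info")]
>       hosts.append("tahoe-grids.allmydata.com")  # benevolent fallback
>       furls = set()
>       for host in hosts:
>           url = "http://%s/introducer/gridid/%s" % (host, grid_id)
>           try:
>               furls.update(json.load(
>                   urllib.request.urlopen(url, timeout=10)))
>           except Exception:
>               continue  # merge whatever answers we do get
>       return sorted(furls)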
>
> The first few introducers should be able to announce other
> introducers, via the distributed gossip-based introduction scheme of
> #68.
>
> Properties:
>
> * claiming a grid ID is cheap: a single domain name registration (in
>   an uncontested namespace), and a simple web server. allmydata.com
>   can publish introducer FURLs for grids that don't want to register
>   their own domain.
>
> * lookup is at least as robust as DNS. By using benevolent public
>   services like tahoe-grids.allmydata.com, reliability can be
>   increased further. The HTTP fetch can return a list of every known
>   server node, all of which can act as introducers.
>
> * not secure: anyone who can interfere with DNS lookups (or claims
>   tahoe-$HASH.com before you do) can cause clients to connect to
>   their servers instead of yours. This admits a moderate DoS attack
>   against download availability. Performing multiple queries (to
>   .net, .org, etc) and merging the results may mitigate this (you'll
>   get their servers *and* your servers; the download search will be
>   slower but is still likely to succeed). It may admit an upload DoS
>   attack as well, or an upload file-reliability attack (tricking you
>   into uploading to unreliable servers), depending upon how the
>   "server selection policy" (see below) is implemented.
>
> Once the client is connected to an introducer, it will see if there
> is a Helper who is willing to assist with the upload or download.
> (For download, this might reduce the number of connections that the
> grid's storage servers must deal with.) If not, the client asks the
> introducers for storage servers and connects to them directly.
>
> == Controlling Access ==
>
> The introducers are not used to enforce access control. Instead, a
> system of public keys is used.
>
> There are a few kinds of access control that we might want to
> implement:
>
> * protect storage space: only let authorized clients upload/consume
>   storage
> * protect download bandwidth: only give shares to authorized clients
> * protect share reliability: only upload shares to "good" servers
>
> The first two are implemented by the server, to protect its
> resources. The last is implemented by the client, to avoid uploading
> shares to unreliable servers (specifically, to maximize the utility
> of the client's limited upload bandwidth: there's no problem with
> putting shares on unreliable peers per se, but it is a problem if
> doing so means the client won't put a share on a more reliable peer).
>
> The first limitation (protect storage space) will be implemented by
> public keys and signed "storage authority" certificates. The client
> will present some credentials to the storage server to convince it
> that the client deserves the space. When storage servers are in this
> mode, they will have a certificate that names a public key, and any
> credentials that can demonstrate a path from that key will be
> accepted. This scheme is described in docs/accounts-pubkey.txt.
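>
> A minimal sketch of that check, assuming a single root key and one
> level of delegation (the real scheme, with longer chains, is in
> docs/accounts-pubkey.txt); Ed25519 from the 'cryptography' package is
> used purely for illustration:
>
>   from cryptography.exceptions import InvalidSignature
>   from cryptography.hazmat.primitives.asymmetric.ed25519 import (
>       Ed25519PrivateKey,
>   )
>
>   root_sk = Ed25519PrivateKey.generate()  # held by the grid operator
>   root_vk = root_sk.public_key()          # configured into each server
>
>   def grant_storage_authority(client_vk_bytes):
>       # The operator signs the client's verifying key: "this key may
>       # consume storage".
>       return root_sk.sign(b"storage-authority:" + client_vk_bytes)
>
>   def server_accepts(client_vk_bytes, certificate):
>       # The server accepts any credential that demonstrates a path
>       # from the key named in its configuration.
>       try:
>           root_vk.verify(certificate,
>                          b"storage-authority:" + client_vk_bytes)
>           return True
>       except InvalidSignature:
>           return False
>
>   cert = grant_storage_authority(b"client-key")
>   assert server_accepts(b"client-key", cert)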
>
> The second limitation is unexplored. The read-cap does not currently
> contain any notion of who must pay for the bandwidth incurred.
>
> The third limitation (only upload to "good" servers), when enabled,
> is implemented by a "server selection policy" on the client side,
> which defines which server credentials will be accepted. This is just
> like the first limitation in reverse. Before clients consider
> including a server in their peer-selection algorithm, they check its
> credentials, and ignore any server that does not meet the policy.
>
> This means that a client may not wish to upload anything to "foreign
> grids", because they have no promise of reliability. The reasons that
> a client might want to upload to a foreign grid need to be examined:
> reliability may not be important, or it might be good enough to
> upload the file to the client's "home grid" instead.
>
> The server selection policy is intended to be fairly open-ended: we
> can imagine a policy that says "upload to any server that has a good
> reputation among group X", or more complicated schemes that require
> less and less centralized management. One important and simple scheme
> is just to have a list of acceptable keys: a friendnet with 5 members
> would include 5 such keys in each policy, enabling every member to
> use the services of the others, without having a single central
> manager with unilateral control over the definition of the group.
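>
> A minimal sketch of that key-list policy for a 5-member friendnet,
> assuming each server announcement carries the server's verifying key
> under a hypothetical "server-pubkey" field:
>
>   ACCEPTABLE_KEYS = {"vk-alice", "vk-bob", "vk-carol",
>                      "vk-dave", "vk-ellen"}  # placeholder key strings
>
>   def eligible_servers(announcements):
>       # Only servers whose key appears in the policy are allowed into
>       # the peer-selection algorithm.
>       return [ann for ann in announcements
>               if ann["server-pubkey"] in ACCEPTABLE_KEYS]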
>
> == Closed Grids ==
>
> To implement these access controls, each client needs to be
> configured with three things (a hypothetical configuration sketch
> follows this list):
>
> * home grid ID (used to find introducers, helpers, storage servers)
> * storage authority (certificate to enable uploads)
> * server selection policy (identify good/reliable servers)
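>
> A hypothetical client configuration carrying those three items (the
> key names are illustrative, not actual config syntax):
>
>   client_config = {
>       "home-grid-id": "foo",
>       "storage-authority": "storage-authority.cert",  # credential file
>       "server-selection-policy": ["vk-alice", "vk-bob"],  # good keys
>   }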
>
> If the server selection policy indicates centralized control (i.e.
> there is some single key X which is used to sign the credentials for
> all "good" servers), then this could be built into the grid ID. By
> using the base32 hash of the pubkey as the grid ID, clients would
> only need to be configured with two things: the grid ID and their
> storage authority. In this case, the introducer would provide the
> pubkey, and the client would compare the hashes to make sure they
> match. This is analogous to how a TubID is used in a FURL.
>
> Such grids would have significantly larger grid IDs, 24 characters or
> more.
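>
> A sketch of deriving such a grid ID, assuming SHA-256 truncated to 24
> lowercase base32 characters (both choices are illustrative):
>
>   import base64, hashlib
>
>   def grid_id_from_pubkey(pubkey_bytes):
>       digest = hashlib.sha256(pubkey_bytes).digest()
>       return base64.b32encode(digest).decode().lower().rstrip("=")[:24]
>
>   # The client hashes the pubkey supplied by the introducer and
>   # compares the result to the configured grid ID, just as a TubID
>   # pins a FURL to a particular key.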
> _______________________________________________
> tahoe-dev mailing list
> tahoe-dev at allmydata.org
> http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev


