This page is out of date. See wiki:NewAccountingDesign .

Quota management is about protecting the Storage Servers. Given a limited
amount of storage space, how should it be allocated between the users? The
admin of a server may give out space in exchange for money, for other storage
space, or out of friendship, but many use cases work better if the admin can
track and control how much space any given user can consume.

There are roughly three different approaches to this: '''secret-based''',
'''private-key-based''', and '''FURL-based'''. From certain points of view, these are all
equivalent: a private key is like a remote capability that can be used to
create attenuated capabilities on demand, which is kind of like an asymmetric
form of hash derivation. The differences consist of tradeoffs between CPU
usage, storage space, and online message exchanges: it is possible to reduce
the dependence upon a central server being available by using signed messages
and certificates that can be generated and verified offline, at the cost of
CPU time and complexity. Likewise the online check can be replaced by a big
table of shared secrets, at the cost of storage space.

All of these schemes use the same core design: each share kept on a Storage
Server is associated with a list of leases. Each lease has an "account
number" that refers to some particular user/account that we might like to
track. The storage servers have the ability to sum the size of all their
shares by account number, to report that account number 4 is consuming 2GB.
By adding these reports across all storage servers, we discover that account
4 is using a total of 10GB. Some other mechanism is used to give a name
(Alice) to account 4. This mechanism might also be able to instruct the
storage servers to stop accepting new leases for that account until it
returns below-quota.
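
The per-account summing described above can be sketched as follows. This is
only an illustration: the function names, the shape of the per-server report,
and the rule that a share counts once toward each account holding a lease on
it are assumptions, not the existing storage-server code.

```python
from collections import defaultdict


def usage_by_account(shares):
    """Sum share sizes per account number on a single storage server.

    `shares` is an iterable of (size_in_bytes, lease_account_numbers) pairs.
    Assumption: a share counts once toward every account that holds at
    least one lease on it.
    """
    totals = defaultdict(int)
    for size, lease_accounts in shares:
        for account in set(lease_accounts):
            totals[account] += size
    return dict(totals)


def grid_usage(per_server_reports):
    """Add the per-server reports together to get grid-wide totals."""
    totals = defaultdict(int)
    for report in per_server_reports:
        for account, used in report.items():
            totals[account] += used
    return dict(totals)


def over_quota(totals, quotas):
    """Return the accounts whose total exceeds their (hypothetical) quota."""
    return sorted(a for a, used in totals.items()
                  if a in quotas and used > quotas[a])
```

The list returned by over_quota() is what some other mechanism would use to
tell storage servers to stop accepting new leases for those accounts.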

The schemes differ in the way that we decide which account number to put in
the lease. If accounts are in use, the client must be confined to use a
specific (authorized) account number: Bob should not be able to get leases
placed using Alice's account number, and Carol (who is not a valid user)
should not be able to place leases at all.

The main design goals we seek to attain here are:

 * optional central control over storage space consumed ("quotas") and
   ability to participate in the grid at all (storage authority). Grids
   which do not wish to maintain this level of control are not required
   to use Account Servers at all. Grids can also enable opt-in quota
   checking, in which clients are trusted to provide their correct
   account number, and to refrain from stealing each other's accounts.
 * when an account is added, the user should be able to use the grid
   immediately
 * when an account is removed, the user should eventually be prohibited
   from using the grid. It is ok if it takes a month for this revocation
   to take effect.
 * most functionality should continue uninterrupted even if a central Account
   Server falls offline

Secondary design goals (not all of which we can meet) are:

 * clients should be able to easily delegate their storage authority to
   someone else, like a trusted Helper or a Repair agent. These two agents
   may need to create leases in the user's name.
 * clients should be able to delegate storage authority for servers they
   haven't met yet. Specifically:
   * the Repairer may be creating leases on new storage servers that were
     added while the client was offline (otherwise the client would be
     repairing its own files).
   * one purpose of the Helper is to allow clients to be unaware of all
     storage servers. If this is the case, the client won't know which
     storage servers it should be delegating authority for.
 * Storage authority may need to be passed through the web API as a query
   parameter. The WUI (the human-facing side) may need a storage-authority
   mechanism as well, perhaps through cookies.
 * storage requirements should remain sensible. A large grid may have one
   million accounts: the storage servers may need to record a shared secret
   for each one, but it would be nicer if they didn't have to.
 * it should be possible to delegate limited authority. It would be nice if
   we could run Helpers on untrusted machines, but if the Helper gets to
   consume the full quotas of all clients who use it, then it must be trusted
   to not do that. If we could delegate just 2MB of storage authority, or
   authority that expired after an hour, we could use more machines for these
   services.

My current plan is to pursue the "secret-based" approach described here. The
other approaches are summarized in subsequent sections.

== Secret-based "storage authority" approach ==

In this scheme, each user has a master storage authority secret: just a
random string, either 128 or 256 bits long. They also have a unique (but
non-secret) "Account Number". In centrally-managed grids, these are both
created and stored by an Account Server, which uses sequential account
numbers. In friend-nets, there is no Account Server: account numbers are
made unique either by agreement or by making them large and random, and there
is more manual work involved to distribute the various secrets.

Each time a client performs a storage operation, it does so under the
auspices of a specific storage authority. The Tahoe node on Alice's computer
is running solely for the benefit of Alice, so all operations it performs
will use Alice's storage authority (i.e. all leases that it creates will
include Alice's account number). On the other hand, a shared Tahoe node
accessed through its web-API port may be working for a variety of users. This
node must be given the storage authority to use for each operation. The means
to do this is still under investigation: adding an account= query argument is
one approach, passing the information through cookies is another; each has
its own advantages and drawbacks.

Each time the client talks to a storage server, it computes a per
(account*SS) secret by hashing the master secret with the Storage Server's
nodeid (the "SSid"). It then prepends the account number, resulting in a
per-SS authority string that looks like 123-lmaypcuoh6c4l3icvvloo2656y. The
Storage Server has a function that takes this string and decides whether or
not the authority is valid, and if valid, which account number to use.
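
A minimal sketch of this derivation, under stated assumptions: the text above
does not name the hash function or the encoding, so SHA-256 over the
concatenation, truncation to 128 bits, and lowercase base32 are all guesses
chosen to produce a string of the same shape as the example. The function
names are hypothetical.

```python
import base64
import hashlib


def per_server_secret(master_secret: bytes, ssid: bytes) -> str:
    """Derive the per-(account*SS) secret from the master secret and SSid.

    Assumptions: SHA-256 of master_secret + ssid, truncated to 16 bytes and
    rendered as unpadded lowercase base32 (26 characters).
    """
    digest = hashlib.sha256(master_secret + ssid).digest()
    return base64.b32encode(digest[:16]).decode("ascii").rstrip("=").lower()


def authority_string(account_number: int, master_secret: bytes,
                     ssid: bytes) -> str:
    """Prepend the account number to form the per-SS authority string."""
    return f"{account_number}-{per_server_secret(master_secret, ssid)}"
```

Because the SSid is hashed in, a string leaked by one storage server is of no
use on any other storage server.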

(the "123" account number is used as an index when communicating with the AS,
to avoid requiring a complete table of user-to-secret mappings. This might
not be the same account number that is used in the final lease, to allow the
creation of "temporary account numbers" that are attenuated in some way, like
a short validity period or limited to a certain number of bytes)

For friend-nets, this "authority-is-valid" function is implemented by a
simple static table lookup. The storage server has a file named
NODE/private/valid-accounts, that contains one line per account. Each line
looks like "123-lmaypcuoh6c4l3icvvloo2656y 123 Alice", and contains the
authority string, followed by the account number to use, followed by the
account's nickname. The client node has a function that creates one of these
lines for every known storage server, and the user can send each line to the
SS's admin and ask them to add it to their valid-accounts file. By doing so,
the SS's admin is granting that user the right to claim storage on the
account named "Alice".
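
The static lookup might look something like this. The line format is the one
given above; skipping blank lines and "#" comment lines is an assumption, as
are the function names.

```python
def parse_valid_accounts(text):
    """Parse the contents of a valid-accounts file.

    Each line is "<authority-string> <account-number> <nickname>".
    Returns a dict mapping authority string -> (account number, nickname).
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumption: comments allowed
            continue
        authority, account, nickname = line.split(None, 2)
        table[authority] = (int(account), nickname)
    return table


def authority_is_valid(table, authority):
    """Return the account number to use, or None if the string is unknown."""
    entry = table.get(authority)
    if entry is None:
        return None
    return entry[0]
```

A server would call parse_valid_accounts() on the file contents at startup
(or whenever the file changes) and authority_is_valid() on each request.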

For a centrally-managed grid, there is a special Account Server node which
manages these authorities. Each storage server is configured with a reference
to the AS by providing NODE/account-server.furl . The authority-is-valid
function works by telling the AS the (SSID,authority-string) pair for each
query. The AS responds with either a "no" or a "yes, account=123" answer.

To reduce network traffic and improve tolerance to AS downtime, the SS
maintains a cache of positive responses. The cache entries are aged out after
one month. Negative responses are not cached.
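
As a rough sketch of that cache (not real code from the storage server): the
class name, the callback that stands in for the remote query to the AS, and
the choice of 30 days for "one month" are all assumptions.

```python
import time

ONE_MONTH = 30 * 24 * 60 * 60  # assumption: "one month" means 30 days


class AuthorityCache:
    """Cache positive authority-is-valid answers; never cache negatives."""

    def __init__(self, ask_account_server, max_age=ONE_MONTH,
                 clock=time.time):
        # ask_account_server(authority) returns an account number, or None
        # for "no". It stands in for the remote call to the AS.
        self._ask = ask_account_server
        self._max_age = max_age
        self._clock = clock
        self._entries = {}  # authority string -> (account number, stored at)

    def lookup(self, authority):
        now = self._clock()
        entry = self._entries.get(authority)
        if entry is not None:
            account, stored_at = entry
            if now - stored_at < self._max_age:
                return account
            del self._entries[authority]  # aged out, ask again
        account = self._ask(authority)
        if account is not None:
            self._entries[authority] = (account, now)
        return account
```

Not caching the "no" answers means a newly created account starts working as
soon as the AS knows about it, at the cost of one query per rejected request.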

Furthermore, the AS gets to pre-emptively manipulate this cache. When the SS
connects to the AS, it makes its "valid-accounts-manager" object available to
it, and this manager object gives the AS complete control over the
valid-accounts table.

A user starts by creating a new account, using some AS-specific mechanism.
(in the case of the allmydata.com commercial grid, this uses a PHP script
that also accepts credit-card payments). The AS records the new user's
storage authority in a table, which is used to answer the subsequent
"authority-is-valid" queries from storage servers. It extracts the account
number from the authority string, looks up the corresponding table entry,
hashes the secret it finds there with the SSID, then compares the result to
the authority string.
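
The AS side of the check could be sketched like this. The hash construction
in derive() is an assumption (SHA-256, truncated, base32) and would have to
match whatever the clients actually use; the table layout and function names
are likewise hypothetical.

```python
import base64
import hashlib
import hmac


def derive(master_secret: bytes, ssid: bytes) -> str:
    # Assumption: same hypothetical derivation the client uses.
    digest = hashlib.sha256(master_secret + ssid).digest()
    return base64.b32encode(digest[:16]).decode("ascii").rstrip("=").lower()


def check_authority(accounts, ssid, authority):
    """Answer an authority-is-valid query from a storage server.

    `accounts` maps account number -> master secret. Returns the account
    number for "yes, account=N", or None for "no".
    """
    number_text, sep, claimed = authority.partition("-")
    if not sep or not (number_text.isascii() and number_text.isdigit()):
        return None
    account = int(number_text)
    master_secret = accounts.get(account)
    if master_secret is None:
        return None
    expected = derive(master_secret, ssid)
    # Constant-time comparison, so the check does not leak how much of the
    # string matched.
    if hmac.compare_digest(expected.encode("ascii"), claimed.encode("utf-8")):
        return account
    return None
```

The account number in the string is only an index into the table: the AS
stores one master secret per account and recomputes the per-SS value on
demand, rather than storing one secret per (account, SS) pair.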

When the AS creates a new account, it also creates authority strings for all
current storage servers, and uses its valid-accounts-manager connections to
push these strings into the SS caches. This improves availability: even if
the AS fell over and died a moment later, the new user would still be able to
use their storage authority for a month without problems.

If the AS deletes an account (because the user has stopped paying their
bills), the AS uses its valid-accounts-manager connection to delete the cache
entries for that account. This accomplishes fast revocation for all storage
servers that are currently online. Any SS that was offline at the time of
the account termination will continue to provide service for the rest of the
month. Other timings are possible: for example the SS might refresh its cache
after a day, but treat AS unavailability as meaning it should keep using the
previous answer.
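
The push and revoke operations might be sketched as below. The real
valid-accounts-manager would be a remote object; here it is an ordinary
in-process class, and its method names are invented for illustration.

```python
class ValidAccountsManager:
    """Stand-in for the object each SS hands to the AS on connect."""

    def __init__(self):
        self.table = {}  # authority string -> account number

    def add(self, authority, account):
        self.table[authority] = account

    def remove_account(self, account):
        stale = [a for a, n in self.table.items() if n == account]
        for authority in stale:
            del self.table[authority]
        return len(stale)


def push_new_account(managers, account, authority_for):
    """Pre-load every connected SS with the new account's authority string.

    `managers` maps SSid -> manager; authority_for(ssid) returns the per-SS
    authority string (its derivation is outside this sketch).
    """
    for ssid, manager in managers.items():
        manager.add(authority_for(ssid), account)


def revoke_account(managers, account):
    """Delete the account from every SS that is currently connected."""
    for manager in managers.values():
        manager.remove_account(account)
```

Only servers present in `managers` at the time of the call are affected,
which is why an offline SS keeps honoring a revoked account until its cache
entry ages out.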

=== creating attenuated authorities ===

Eventually we may want to take advantage of untrusted Helpers, by allowing
clients to create attenuated storage authority strings. The possessor of
these strings might be allowed to claim leases for a specific storage index,
or only for a certain number of bytes. The untrusted Helper might abuse this
authority, but the damage it can do is limited by the extra restrictions.

Likewise, the Repairer might get a "repair-cap" which contains enough
information to download and verify the plaintext, and enough authority to
upload new shares in the name of the original uploader. This repair-cap could
contain an authority string which can only be used to create shares for the
specific storage index. It might also be restricted to creating shares that
contain a specific root hash, to prevent the repairer from using the
authority to store its own data in the same slot (on new storage servers).

The shared-secret scheme is the least favorable for creating attenuated
authorities: it requires more work (and more network traffic) than DSA
private-key approaches. To provide this, the client contacts the AS and asks
it for an attenuated authority string: it specifies the conditions of use
(validity period, storage index restrictions, size limits, etc), and gets
back a new string and account number. The client uses this pair when
delegating authority to Helpers and Repairers, instead of their usual
(full-powered) pair.

The Account Server adds an entry to its authority table with the new number
and string. When storage servers eventually come asking about the validity of
derived strings, the AS will find this entry, read out the restrictions, and
respond to the SS with the (restrictions, real account number) pair. (one
might think of the "real account number" as a special kind of restriction:
the string grants the authority to consume space in account 123, possibly
with other restrictions).

The SS will enforce these restrictions. When the restriction involves total
storage space consumed, the SS will need to maintain a table that is indexed
by the authority string, counting bytes. This sort of restriction will be
much easier to manage if the authority includes a duration restriction,
because that will tend to limit the size of this table.
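
A sketch of how the SS might enforce these restrictions. The Restrictions
fields mirror the conditions mentioned above (size limit, validity period,
storage index), but the record layout, class names, and method names are
assumptions.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Restrictions:
    real_account: int                      # account the space is charged to
    max_bytes: Optional[int] = None        # total bytes this string may claim
    expires_at: Optional[float] = None     # absolute expiry time, in seconds
    storage_index: Optional[bytes] = None  # only valid for this index


class RestrictionEnforcer:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._used = {}  # authority string -> bytes consumed so far

    def bytes_used(self, authority):
        return self._used.get(authority, 0)

    def authorize(self, authority, restrictions, storage_index, size):
        """Return the real account number, or None if the request is denied."""
        r = restrictions
        if r.expires_at is not None and self._clock() >= r.expires_at:
            # Expired strings can be dropped from the byte-count table.
            self._used.pop(authority, None)
            return None
        if r.storage_index is not None and storage_index != r.storage_index:
            return None
        if r.max_bytes is not None:
            used = self._used.get(authority, 0)
            if used + size > r.max_bytes:
                return None
            self._used[authority] = used + size
        return r.real_account
```

The pop() on expiry is the point made above: a duration restriction gives the
SS a natural moment to forget the byte counter for that string.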

Clearly, the AS must be online and reachable to generate these attenuated
authorities. Likewise, either the AS must inform the SS about the strings and
restrictions at the time of creation (storage+network), or the AS must be
reachable by the SS when the strings are used (availability). A
private-key-based scheme would not suffer from this tradeoff.

The need for attenuated authorities is not fully established at this point,
so the relative simplicity of the shared-secret approach remains appealing
(i.e. the fact that private-key makes attenuation easier is not a strong
motivation to go with DSA over shared-secret). Untrusted Helpers could
alternatively be required to establish leases under their own authority, in a
make-before-break handoff: the Helper uploads the shares and adds temporary
helper leases, the client is informed about the shares and establishes its
own leases, then the helper cancels its temporary leases. Likewise the
Repairer could maintain its own leases on behalf of offline clients, keeping
track of how much space it is consuming for whom, and reporting the quota
data to the accounting machinery (and perhaps simply refusing to repair
accounts which have gone over-quota). When the client comes back online and
syncs up with the Repairer, the client can establish its own leases on the
repairer-generated shares, allowing the Repairer to drop the temporary ones.

These temporary-leases would add traffic but would remove the need to
delegate storage authority to the helper, removing some of the need for
attenuated authorities.
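
The ordering of the make-before-break handoff can be illustrated with a toy
model in which a share's leases are just a set of account identifiers. The
function name and the data model are invented for this sketch.

```python
def make_before_break(leases, share_id, helper_account, client_account):
    """Walk through the handoff, yielding the lease holders after each step.

    `leases` maps share id -> set of accounts holding a lease on that share.
    """
    holders = leases.setdefault(share_id, set())

    # 1. The Helper uploads the share and adds its own temporary lease.
    holders.add(helper_account)
    yield set(holders)

    # 2. The client is told about the share and establishes its own lease.
    holders.add(client_account)
    yield set(holders)

    # 3. Only now does the Helper cancel its temporary lease.
    holders.discard(helper_account)
    yield set(holders)
```

At every step the share has at least one lease, so it is never eligible for
garbage collection during the handoff, and the Helper never holds the
client's storage authority.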

Another conceivable use for attenuated authority would be to give "drop-box"
access to other users: provide a write-cap (or a hypothetical append-cap) to
a directory, along with enough storage authority to use it. This would allow
Alice to safely grant non-account-holders the ability to send her files. The
current accounting mechanisms we are developing do not allow this:
non-account-holders can read files, but not add new ones.


== DSA private-key -based "membership card" approach ==

In this scheme, each user has a DSA private key, and a "membership card"
(signed by a central Account Server) that declares that the associated pubkey
is an authorized member of the grid. Clients then use that key to sign
messages that are sent to the storage servers. The SS will verify the
signatures before accepting lease requests. Attenuated authority is
implemented by certificate chains: the client creates a new private/public
key pair, uses their main key to sign a cert declaring that the new key has
certain (limited) powers, then gives the new privkey and certificate chain to
the delegate. The delegate uses the new privkey to sign request messages.

To handle revocation, the certificates need to have expiration dates, and get
replaced every once in a while.

This approach maximizes offline-ness: the Account Server is only known by its
public key, and almost never needs to be spoken to directly. It also
minimizes storage requirements: everything can be computed from the
certificates that are passed around. PK operations are somewhat expensive,
but this can be mitigated through more protocol work: creating a
limited-purpose shared secret on-demand that effectively caches the PK
verification check.


== FURL-based "managed introducer" approach ==

Here's a basic plan for how to configure "managed introducers". The basic
idea is that we have two types of grids: managed and unmanaged. The current
code implements "unmanaged" grids: complete free-for-all, anyone who can get
to the Introducer can thus get to all the servers, anyone who can get to a
server gets to use as much space as they want. In this mode, each client uses
their 'introducer.furl' to connect to the Introducer, which serves two
purposes: tell the client about all the servers they can use, and tell all
other clients about the server being offered by the new node.

The "managed introducer" approach is for an environment where you want to be
able to keep track of who is using what, and to prevent unmanaged clients
from using any storage space.

In this mode, we have an Account Manager instead of an Introducer. Each
client gets a special, distinct facet on this account manager: this gives
them control over their account, and allows them to access the storage space
enabled by virtue of having that account. This is stored in
"my-account.furl", which replaces "introducer.furl" for this purpose.

In addition, the servers get an "account-manager.furl" instead of an
"introducer.furl". The servers connect to this object and offer themselves as
storage servers. The Account Manager remembers a list of all the
currently-available storage servers.

When a client wants more storage servers (perhaps updated periodically, and
perhaps using some sort of minimal update protocol (Bloom Filters!)), they
contact their Account object and ask for introductions to storage servers.
This causes the Account Manager to go to all servers that the client doesn't
already know about and tell them "generate a FURL to a facet for the benefit
of client 123. Give me that FURL.". The Account Manager then sends the list
of new FURLs to the client, who adds them to its peerlist. This peerlist
contains tuples of (nodeid, FURL).

The Storage Server will grant a facet to anyone that the Account Manager
tells them to. The Storage Server is really just updating a table that maps
from a random number (the FURL's swissnum) to the system-wide small-integer
account number. The FURL will dereference to an object that adds an
accountNumber=123 to all write() calls, so that they can be stored in leases.
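
The swissnum table and the facet object could be sketched in plain Python as
follows. Real FURLs are remote references; this stand-in uses a local
dictionary and ordinary method calls, and every class and method name here is
hypothetical.

```python
import secrets


class StorageServer:
    def __init__(self):
        self._facets = {}  # swissnum -> account number
        self.leases = []   # (account number, storage index, size)

    def grant_facet(self, account_number):
        """Called on behalf of the Account Manager; returns a new swissnum."""
        swissnum = secrets.token_hex(16)  # unguessable random identifier
        self._facets[swissnum] = account_number
        return swissnum

    def dereference(self, swissnum):
        """Turn a swissnum into a facet; unknown values raise KeyError."""
        return AccountFacet(self, self._facets[swissnum])

    def _write(self, storage_index, data, accountNumber):
        # Record the lease against the account; storing the bytes themselves
        # is omitted from this sketch.
        self.leases.append((accountNumber, storage_index, len(data)))


class AccountFacet:
    """Object a client reaches through its FURL: adds the account number."""

    def __init__(self, server, account_number):
        self._server = server
        self._account_number = account_number

    def write(self, storage_index, data):
        return self._server._write(storage_index, data,
                                   accountNumber=self._account_number)
```

The client never supplies an account number itself: whoever holds the
swissnum can only write under the account the table maps it to.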


In this approach, the Account Manager is a bottleneck only for the initial
contact: the clients all remember their list of Storage Server FURLs for a
long time. Clients must contact their Account to take advantage of new
servers: the update traffic for this needs to be examined. I can imagine this
working reasonably well up to a few hundred servers and say 100k clients if
the clients are only asking about new servers once a day (roughly one query
per second).
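
The load estimate is simple arithmetic, shown here with a hypothetical helper
function. It assumes the queries are spread evenly over the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def queries_per_second(clients, queries_per_client_per_day=1):
    """Average query rate seen by the Account Manager."""
    return clients * queries_per_client_per_day / SECONDS_PER_DAY


rate = queries_per_second(100_000)  # a little over one query per second
```

If clients all poll at the same time of day, the peak rate would be much
higher than this average.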