wiki:NewAccountingDesign

Version 4 (modified by zooko, at 2012-03-23T04:49:31Z)

originally from https://tahoe-lafs.org/pipermail/tahoe-dev/2010-December/005748.html :

Each time we cycle around this topic, we chip away at the complexity, prune back some of the loftier goals, haggle for a couple of weeks, then throw up our hands and go back to our day jobs for another couple of months. (See AccountingDesign and QuotaManagement for previous iterations.)

This time around, here are the previous lofty goals that I [Brian] am going to move into the non-goal category:

  • repairer: who pays for the new share?
  • sub-accounts, delegation, allmydata partners
  • public webapi node: extending accounting beyond node and through webapi/WUI: when Bob uses a public WUI, how can his shares be counted against his quota instead of the webapi operator's?

and I want to move a set of things into a "phase 2" category, to be figured out after we get the "phase 1" stuff done:

  • Invitations
  • transitive introductions
  • account managers
  • pay-for-storage
  • tit-for-tat

Also, we have a new inspiration: we've been talking a lot about the work of Elinor Ostrom, who has written extensively about communities that successfully manage common resources without suffering from the well-known "Tragedy of the Commons". She identified a set of principles that these communities had in common:

  1. Clearly defined boundaries (effective exclusion of external unentitled parties);
  2. Rules regarding the appropriation and provision of common resources are adapted to local conditions;
  3. Collective-choice arrangements allow most resource appropriators to participate in the decision-making process;
  4. Effective monitoring by monitors who are part of or accountable to the appropriators;
  5. There is a scale of graduated sanctions for resource appropriators who violate community rules;
  6. Mechanisms of conflict resolution are cheap and of easy access;
  7. The self-determination of the community is recognized by higher-level authorities;
  8. In the case of larger common-pool resources: organization in the form of multiple layers of nested enterprises, with small local CPRs at the base level.

In the Tahoe context, that means that the participants of a given grid (both clients using storage and servers providing storage) should have a lot of information about who is using what, and where, and should have control over their own space (being able to say no). There should be some obviously public broadcast channels, so everybody knows that everybody else knows who is using what.

So, in my current line of thinking, an Accounting Phase 1 is looking like this:

* tahoe.cfg:storage gets some new flags:

  • accounting=enabled
    • this turns on the lease-owner DB. Existing shares are marked 'anonymous'. New shares that arrive through the old RIStorageServer interface are labeled according to the TubID of the other end of the connection. New shares that arrive through the new RIAccountableStorageServer interface are labeled according to the account under which that interface object was created (see below).
  • accounting=required
    • this reads "storage-accounts.txt" for a list of accounts. Each contains a pubkey, a petname, and maybe some additional information (either local notes, or self-describing data sent by the privkey holder)
    • the RIStorageServer interface no longer accepts shares. Only RIAccountableStorageServer accepts them.
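
The labeling rules for the two modes can be sketched as a small decision function. This is only an illustration of the rules as described above, not code from Tahoe: the names `AccountingMode`, `ShareRefused`, and `label_new_share` are hypothetical, and the `DISABLED` mode (no lease-owner DB at all) is an assumption about what happens when the flag is absent.

```python
from enum import Enum
from typing import Optional


class AccountingMode(Enum):
    DISABLED = "disabled"   # assumed default; the proposal only names the next two
    ENABLED = "enabled"
    REQUIRED = "required"


class ShareRefused(Exception):
    """The server will not accept a share over this interface."""


def label_new_share(mode: AccountingMode, via_accountable: bool,
                    tubid: str, account: Optional[str] = None) -> Optional[str]:
    """Return the lease-owner label for a newly arriving share.

    Returns None when accounting is off (assumed: no lease-owner DB exists).
    """
    if mode is AccountingMode.DISABLED:
        return None
    if via_accountable:
        # RIAccountableStorageServer: label by the account that owns the facet.
        if account is None:
            raise ValueError("an accountable facet is always tied to an account")
        return account
    # Old RIStorageServer interface from here on.
    if mode is AccountingMode.REQUIRED:
        raise ShareRefused("RIStorageServer no longer accepts shares")
    # accounting=enabled: label by the TubID of the other end of the connection.
    return tubid
```

Shares that already exist when the DB is first turned on are not covered by this function; per the description above they would simply be marked 'anonymous'.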

* tahoe.cfg:client gets some new flags

  • actually it needs to be in private/ somewhere
  • add a privkey. If present, clients will connect to RIStorageServer, then attempt to upgrade to RIAccountableStorageServer by sending a signed upgrade request
  • clients do all their storage ops through the RIAccountableStorageServer, which causes their shares to be labeled
  • RIAccountableStorageServer also includes get-my-total-usage methods

* the welcome page gets a new control panel

  • not sure if it needs to be user-private or not
  • storage-server panel:
    • contains lists of accounts that are consuming your storage
    • if accounting=required, add buttons to freeze/thaw the account, and a cautious button to delete all shares
  • client panel:
    • contains lists of servers that are holding your shares
  • combo "grid" panel:
    • contains both, correlated

* maybe a broadcast channel of activity

  • a daily (maybe hourly at first) digest of aggregate usage
    • "Bob uploaded 62MB of data". "Alice downloaded 146MB of data"
    • "Bob is currently using 3.5GB of storage space"
    • "Alice is currently hosting 4.2GB of shares and has 0.8GB free"
  • also include new-server, new-client events
    • "Carol joined the grid, offering 3.0GB of storage space"
    • "Dave invited Edgar to join the grid"
  • and server-admin actions
    • "Carol froze Bob's shares: dude, you're using too much"
    • "Dave deleted Alice's shares: you unfriended me on facebook so I'm deleting all your data"
  • also generalized chat
    • "Bob says: anyone up for pizza tonight?"
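
As a rough illustration of the usage digest, the aggregate messages could be produced from two per-participant byte counters. Everything here is hypothetical (`format_size`, `usage_digest`, and the choice of decimal units are assumptions made for the example); the proposal does not say how the digest would be built or delivered.

```python
from typing import Dict, List


def format_size(nbytes: int) -> str:
    """Render a byte count like '62MB' or '3.5GB' (decimal units, assumed)."""
    for unit, factor in (("GB", 10**9), ("MB", 10**6), ("kB", 10**3)):
        if nbytes >= factor:
            # One decimal place, with a trailing ".0" trimmed off.
            text = f"{nbytes / factor:.1f}".rstrip("0").rstrip(".")
            return text + unit
    return f"{nbytes}B"


def usage_digest(uploaded: Dict[str, int], stored: Dict[str, int]) -> List[str]:
    """Build digest lines from bytes uploaded this period and bytes stored now."""
    lines = []
    for name in sorted(uploaded):
        lines.append(f"{name} uploaded {format_size(uploaded[name])} of data")
    for name in sorted(stored):
        lines.append(
            f"{name} is currently using {format_size(stored[name])} of storage space"
        )
    return lines
```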

* storage server needs a new crawler

  • or the existing LeaseCrawler needs some new features
  • shares contain canonical lease info, but local who-is-consuming-what and remote get-my-total-usage methods need pre-generated totals
  • once usage DB is complete, new shares are added at time of upload
  • but we must be able to generate/regenerate the usage DB from just the shares (er, from the shares plus a table of ownerid->account data, since the share.lease.ownerid field is too small)
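
A minimal sketch of that regeneration step, assuming each share carries its size and a list of leases with a small integer ownerid. The `Share` and `Lease` types, the function names, and the policy of charging the full share size to every account that holds a lease on it are all assumptions made for illustration; the real crawler and share format are not shown here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Lease:
    ownerid: int  # small integer stored in the share's lease record


@dataclass(frozen=True)
class Share:
    size: int                 # bytes consumed on disk
    leases: Tuple[Lease, ...]


def regenerate_usage(shares: Iterable[Share],
                     owner_table: Dict[int, str]) -> Dict[str, int]:
    """Rebuild per-account totals from the shares plus the ownerid->account table."""
    totals: Dict[str, int] = defaultdict(int)
    for share in shares:
        # Unknown ownerids fall back to 'anonymous'; an account holding several
        # leases on the same share is only charged once (assumed policy).
        accounts = {owner_table.get(l.ownerid, "anonymous") for l in share.leases}
        for account in accounts:
            totals[account] += share.size
    return dict(totals)


def record_upload(totals: Dict[str, int], account: str, size: int) -> None:
    """Once the DB is complete, new shares are added at upload time."""
    totals[account] = totals.get(account, 0) + size
```

A crawler pass would call `regenerate_usage` over every share on disk, while the upload path would call `record_upload` so the totals stay current between passes.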

* RIStorageServer gets new upgrade method

  • accepts a signed request, returns RIAccountableStorageServer facet
  • request needs to be scoped correctly: server1 should not be able to get Alice's facet on server2. Request should include serverid.
  • for transitive introductions, request may also contain recommendations / certchains / introduction path
  • upgrade method may fail when server doesn't like the client
  • might be a temporary failure: the upgrade request might get elevated to the storage server admin for approval. Might want "try again later (at time=T)" response code.
  • storage requests to RIAccountableStorageServer might fail if server-admin freezes or cancels the account. get-my-total-usage should keep working in many cases.

The general idea is that new (accounting-aware) clients will use a new storage-server object, one which is dedicated just to a specific account (as defined by an ECDSA pubkey), which will keep track of how much they're using. To access the new object, they'll submit a signed request to some publicly visible object (either the old RIStorageServer FURL, or a new account-desk FURL published next to the old one in the #466 dict-based announcement). The new object will also have methods to query current usage. Clients ought to keep track of how much they've uploaded to servers, but since they haven't ever done this in the past, it may be hard to get an accurate count without relying upon each server.
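
To make the scoping requirement concrete, here is a toy sketch of the upgrade exchange. It is deliberately not ECDSA: to stay within the Python standard library, an HMAC over a per-account key stands in for the signature, so the "key" here is shared with the server rather than being a public key. The request layout (JSON with `account` and `serverid` fields) and the names `AccountDesk`, `AccountFacet`, `UpgradeRefused`, and `make_upgrade_request` are all hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict


class UpgradeRefused(Exception):
    """The server declined to hand out an account facet."""


class AccountFacet:
    """Stand-in for the per-account RIAccountableStorageServer object."""
    def __init__(self, account: str):
        self.account = account


def _sign(key: bytes, message: bytes) -> bytes:
    # STAND-IN for an ECDSA signature; a real design would verify with a pubkey.
    return hmac.new(key, message, hashlib.sha256).digest()


def make_upgrade_request(key: bytes, account: str, serverid: str) -> dict:
    # The serverid is inside the signed body, so the request is scoped to one server.
    body = json.dumps({"account": account, "serverid": serverid},
                      sort_keys=True).encode("utf-8")
    return {"body": body, "signature": _sign(key, body)}


class AccountDesk:
    def __init__(self, serverid: str, accounts: Dict[str, bytes]):
        self.serverid = serverid
        self.accounts = accounts  # account name -> verification key

    def upgrade(self, request: dict) -> AccountFacet:
        body = request["body"]
        claims = json.loads(body)
        key = self.accounts.get(claims.get("account"))
        if key is None:
            raise UpgradeRefused("unknown account")
        if not hmac.compare_digest(_sign(key, body), request["signature"]):
            raise UpgradeRefused("bad signature")
        if claims.get("serverid") != self.serverid:
            # A request signed for server1 must not work when replayed to server2.
            raise UpgradeRefused("request is scoped to a different server")
        return AccountFacet(claims["account"])
```

With this shape, server1 can replay Alice's request to server2 byte for byte, but server2 refuses it because the signed body names server1.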

The Introducer will also need to help create the broadcast channel, either by presenting broadcast messages as Announcements, or by running a pubsub service directly.

If we complete this, and we implement some small UI/tahoe.cfg for it, I think we'll have a good starting point for friend-net grids:

  • each client will generate a keypair at startup
  • all shares (well, all leases) will be associated with a specific client (with a nickname)
  • servers will accept shares from anybody, but they'll have a table of who is using how much. An attacker can still create thousands of throwaway accounts, but it'll be obvious that this is going on. In the non-attacker case, all participants will be able to view how space is being used by each other.

Then we can start to figure out the not-so-friendly grids, where you don't accept data from just anybody. To get there, we'll need to start with explicitly-configured whitelists (a list of the client pubkey identifiers that can store data; nobody else is accepted), and then quickly move to a better (i.e. more automatic) workflow like Invitations or Account Managers or something. Eventually that may get us back to more traditional economics where a node can accept money of some sort (BTC!) to enable storage for a given client.