[tahoe-dev] Accounting, 2010 edition

Brian Warner warner at lothar.com
Mon Dec 20 06:51:04 UTC 2010


It's December, which means it's time to talk about Accounting again[1].

Each time we cycle around this topic, we chip away at the complexity,
prune back some of the loftier goals, haggle for a couple of weeks, then
throw up our hands and go back to our day jobs for another couple of
months.

This time around, here are the previous lofty goals that I'm going to
move into the non-goal category:

   - repairer: who pays for the new share?
   - sub-accounts, delegation, allmydata partners
   - public webapi node: extending accounting beyond node and through
     webapi/WUI: when Bob uses a public WUI, how can his shares be
     counted against his quota instead of the webapi operator's?

and I want to move a set of things into a "phase 2" category, to be
figured out after we get the "phase 1" stuff done:

   - Invitations
   - transitive introductions
   - account managers
   - pay-for-storage
   - tit-for-tat

Also, we have a new inspiration: we've been talking a lot about the work
of Elinor Ostrom[2], who has written a lot about communities who
successfully manage common resources without suffering from the
well-known "Tragedy Of The Commons". She established a set of principles
that these communities had in common:

   1. Clearly defined boundaries (effective exclusion of external
      unentitled parties);
   2. Rules regarding the appropriation and provision of common
      resources are adapted to local conditions;
   3. Collective-choice arrangements allow most resource appropriators
      to participate in the decision-making process;
   4. Effective monitoring by monitors who are part of or accountable to
      the appropriators;
   5. There is a scale of graduated sanctions for resource appropriators
      who violate community rules;
   6. Mechanisms of conflict resolution are cheap and of easy access;
   7. The self-determination of the community is recognized by
      higher-level authorities;
   8. In the case of larger common-pool resources: organization in the
      form of multiple layers of nested enterprises, with small local
      CPRs at the base level.

In the Tahoe context, that means that the participants of a given grid
(both clients using storage and servers providing storage) should have a
lot of information about who is using what and where, and should have
control over that space (being able to say no). There should be some
obviously public broadcast channels, so everybody knows that everybody
else knows who is using what.

So, in my current line of thinking, an Accounting Phase 1 is looking
like this:

*** tahoe.cfg:storage gets some new flags:
    - accounting=enabled
      - this turns on the lease-owner DB. Existing shares are marked
        'anonymous'. New shares that arrive through the old
        RIStorageServer interface are labeled according to the TubID of
        the other end of the connection. New shares that arrive through
        the new RIAccountableStorageServer interface are labeled
        according to the account under which that interface object was
        created (see below).
    - accounting=required
      - this reads "storage-accounts.txt" for a list of accounts. Each
        contains a pubkey, a petname, and maybe some additional
        information (either local notes, or self-describing data sent by
        the privkey holder)
      - the RIStorageServer interface no longer accepts shares. Only
        RIAccountableStorageServer accepts them.
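
The labeling rules above can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: the function name, the
"tubid:"/"account:" label prefixes, and the use of PermissionError are
all hypothetical, not existing Tahoe code.

```python
from typing import Optional

ANONYMOUS = "anonymous"  # label for shares that predate accounting


def label_for_new_share(mode: str, interface: str,
                        tub_id: Optional[str] = None,
                        account: Optional[str] = None) -> str:
    """Return the owner label for a newly arriving share.

    mode is the tahoe.cfg value ("enabled" or "required"); interface is
    the name of the remote interface the share arrived through.
    Raises PermissionError when the share must be refused.
    """
    if mode not in ("enabled", "required"):
        raise ValueError("unknown accounting mode: %r" % (mode,))
    if interface == "RIAccountableStorageServer":
        if account is None:
            raise ValueError("accountable interface needs an account")
        # Labeled by the account the interface object was created for.
        return "account:" + account
    if interface == "RIStorageServer":
        if mode == "required":
            # accounting=required: the old interface stops accepting shares.
            raise PermissionError("RIStorageServer no longer accepts shares")
        if tub_id is None:
            raise ValueError("old interface needs the peer's TubID")
        # accounting=enabled: labeled by the TubID of the other end.
        return "tubid:" + tub_id
    raise ValueError("unknown interface: %r" % (interface,))
```

Shares that already exist when accounting is turned on would simply get
the ANONYMOUS label; only newly arriving shares go through this function.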

*** tahoe.cfg:client gets some new flags
    - actually it needs to be in private/ somewhere
    - add a privkey. If present, clients will connect to
      RIStorageServer, then attempt to upgrade to
      RIAccountableStorageServer by sending a signed upgrade request
    - clients do all their storage ops through the
      RIAccountableStorageServer, which causes their shares to be
      labeled
    - RIAccountableStorageServer also includes get-my-total-usage
      methods

*** the welcome page gets a new control panel
    - not sure if it needs to be user-private or not
    - storage-server panel:
      - contains lists of accounts that are consuming your storage
      - if accounting=required, add buttons to freeze/thaw the account,
        plus a cautious button to delete all of its shares
    - client panel:
      - contains lists of servers that are holding your shares
    - combo "grid" panel:
      - contains both, correlated

*** maybe broadcast channel of activity
    - daily (maybe hourly at first) digest of aggregate usage
      - "Bob uploaded 62MB of data". "Alice downloaded 146MB of data"
      - "Bob is currently using 3.5GB of storage space"
      - "Alice is currently hosting 4.2GB of shares and has 0.8GB free"
    - also include new-server, new-client events
      - "Carol joined the grid, offering 3.0GB of storage space"
      - "Dave invited Edgar to join the grid"
    - and server-admin actions
      - "Carol froze Bob's shares: dude, you're using too much"
      - "David deleted Alice's shares: you unfriended me on facebook so
        I'm deleting all your data"
    - also generalized chat
      - "Bob says: anyone up for pizza tonight?"

*** storage server needs a new crawler
    - or the existing LeaseCrawler needs some new features
    - shares contain canonical lease info, but local
      who-is-consuming-what and remote get-my-total-usage methods need
      pre-generated totals
    - once usage DB is complete, new shares are added at time of upload
    - but we must be able to generate/regenerate usage DB from just the
      shares (er, just shares plus table of ownerid->account data, since
      share.lease.ownerid field is too small)
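
A minimal sketch of that regeneration step, assuming each share can be
reduced to its size plus the list of lease ownerids it carries. The data
shapes and function names here are hypothetical, and charging the full
share size to every account that holds a lease on it is just one
possible policy:

```python
from collections import defaultdict


def rebuild_usage(shares, ownerid_to_account):
    """Recompute per-account totals from nothing but the shares.

    shares: iterable of (size_in_bytes, [lease ownerid, ...])
    ownerid_to_account: the side table mapping the small ownerid field
        to an account; unknown ownerids are counted as "anonymous".
    """
    usage = defaultdict(int)
    for size, lease_ownerids in shares:
        # Map ownerids to accounts first, so an account holding two
        # leases on the same share is only charged once.
        accounts = {ownerid_to_account.get(ownerid, "anonymous")
                    for ownerid in lease_ownerids}
        for account in accounts:
            usage[account] += size
    return dict(usage)


def add_share(usage, account, size):
    """Incremental update applied at upload time once the DB is complete."""
    usage[account] = usage.get(account, 0) + size
```

The crawler would call rebuild_usage() when the DB is missing or
suspect, and uploads would call add_share() the rest of the time.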

*** RIStorageServer gets new upgrade method
    - accepts a signed request, returns RIAccountableStorageServer facet
    - request needs to be scoped correctly: server1 should not be able
      to get Alice's facet on server2. Request should include serverid.
    - for transitive introductions, request may also contain
      recommendations / certchains / introduction path
    - upgrade method may fail when server doesn't like the client
    - might be a temporary failure: the upgrade request might get
      elevated to the storage server admin for approval. Might want "try
      again later (at time=T)" response code.
    - storage requests to RIAccountableStorageServer might fail if
      server-admin freezes or cancels the account. get-my-total-usage
      should keep working in many cases.
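
The scoping and response-code ideas above might look something like the
following sketch. Everything in it is an assumption for illustration:
the JSON request body, the function names, and the response tuples are
invented, and HMAC is used only as a symmetric stand-in so the example
runs with the standard library. The real thing would use a signature
made by the client's privkey and checked against the account's pubkey.

```python
import hashlib
import hmac
import json


def make_upgrade_request(sign, account_id, server_id):
    """Client side: sign a request that names the one server it is for."""
    body = json.dumps({"account": account_id, "serverid": server_id},
                      sort_keys=True).encode("utf-8")
    return {"body": body, "signature": sign(body)}


def handle_upgrade(request, my_server_id, verify, pending=None):
    """Server side: return ("ok", account), ("rejected", reason), or
    ("try-again-later", time) for a signed upgrade request.

    verify(account, body, signature) -> bool checks the signature.
    pending maps accounts awaiting admin approval to a retry time.
    """
    body = request["body"]
    fields = json.loads(body.decode("utf-8"))
    account = fields["account"]
    if not verify(account, body, request["signature"]):
        return ("rejected", "bad signature")
    if fields["serverid"] != my_server_id:
        # A request signed for server1 must not be replayable at server2.
        return ("rejected", "request is scoped to another server")
    if pending and account in pending:
        return ("try-again-later", pending[account])
    return ("ok", account)  # a real server would return the facet here


def hmac_signer(secret):
    """Stand-in signer (NOT the proposed ECDSA scheme)."""
    return lambda body: hmac.new(secret, body, hashlib.sha256).digest()


def hmac_verifier(secrets):
    """Stand-in verifier keyed by a per-account shared secret."""
    def verify(account, body, signature):
        secret = secrets.get(account)
        if secret is None:
            return False
        expected = hmac.new(secret, body, hashlib.sha256).digest()
        return hmac.compare_digest(expected, signature)
    return verify
```

Because the serverid is inside the signed body, a server that receives
Alice's request cannot forward it to another server to obtain her facet
there.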

The general idea is that new (accounting-aware) clients will use a new
storage-server object, one which is dedicated just to a specific account
(as defined by an ECDSA pubkey), which will keep track of how much
they're using. To access the new object, they'll submit a signed request
to some publicly-visible object (either the old RIStorageServer FURL,
or a new account-desk FURL published next to the old one in the #466
dict-based announcement). The new object will also have methods to query
current usage. Clients ought to keep track of how much they've uploaded
to servers, but since they haven't ever done this in the past, it may be
hard to get an accurate count without relying upon each server.

The Introducer will also need to help create the broadcast channel,
either by presenting broadcast messages as Announcements, or by running
a pubsub service directly.


If we complete this, and we implement some small UI/tahoe.cfg for it, I
think we'll have a good starting point for friend-net grids:

 - each client will generate a keypair at startup
 - all shares (well, all leases) will be associated with a specific
   client (with a nickname)
 - servers will accept shares from anybody, but they'll have a table of
   who is using how much. An attacker can still create thousands of
   throwaway accounts, but it'll be obvious that this is going on. In
   the non-attacker case, all participants will be able to view how
   space is being used by each other.

Then we can start to figure out the not-so-friendly grids, where you
don't accept data from just anybody. To get there, we'll need to start
with explicitly-configured whitelists (a list of client pubkey
identifiers that may store data; nobody else is accepted), and then
quickly move to a better (i.e. more automatic) workflow like Invitations
or Account Managers or something. Eventually that may get us back to
more traditional economics where a node can accept money of some sort
(BTC!) to enable storage for a given client.


Sound like fun? Thoughts?

cheers,
 -Brian



[1]: I'm kidding. We talk about Accounting on Tuesdays, regardless of
     what month it is.
[2]: http://en.wikipedia.org/wiki/Elinor_Ostrom

