['*' means complete]

Connection Management:
*v1: foolscap, no relay, live=connected-to-introducer, broadcast updates,
     fully connected topology
*v2: configurable IP address -- http://allmydata.org/trac/tahoe/ticket/22
 v3: live != connected-to-introducer, connect on demand
 v4: decentralized introduction -- http://allmydata.org/trac/tahoe/ticket/68
 v5: relay?

File Encoding:
*v1: single-segment, no merkle trees
*v2: multiple-segment (LFE)
*v3: merkle tree to verify each share
*v4: merkle tree to verify each segment
*v5: merkle tree on plaintext and crypttext: incremental validation
 v6: only retrieve the minimal number of hashes instead of all of them

Share Encoding:
*v1: fake it (replication)
*v2: PyRS
*v2.5: ICodec-based codecs, but still using replication
*v3: C-based Reed-Solomon

URI:
*v1: really big
*v2: store URI Extension with shares
*v3: derive storage index from readkey
 v4: perhaps derive more information from version and filesize, to remove
     codec_name, codec_params, tail_codec_params, needed_shares,
     total_shares, segment_size from the URI Extension

Upload Peer Selection:
*v1: permuted peer list, consistent hash
*v2: permute peers by verifierid and arrange around ring, intermixed with
     shareids on the same range, each share goes to the
     next-clockwise-available peer
 v3: reliability/goodness-point counting?
 v4: denver airport (chord)?

Download Peer Selection:
*v1: ask all peers
 v2: permute peers and shareids as in upload, ask next-clockwise peers first
     (the "A" list), if necessary ask the ones after them, etc.
 v3: denver airport?

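sketch of the "permuted peer list" idea from Upload Peer Selection v1: each
file gets its own peer ordering, derived by hashing a per-file value together
with each peer id. This is only an illustration. The function names, the use
of SHA-256, and the round-robin share placement are assumptions made here,
not the project's actual code.

```python
import hashlib


def permuted_peers(peer_ids, storage_index):
    """Order peers by hash(storage_index + peer_id).

    Every file (storage index) sees a different but repeatable ordering,
    so uploader and downloader agree on it without coordination.
    SHA-256 is an assumed choice of hash for this sketch.
    """
    def sort_key(peer_id):
        return hashlib.sha256(storage_index + peer_id).digest()

    return sorted(peer_ids, key=sort_key)


def assign_shares(peer_ids, storage_index, total_shares):
    """Hypothetical placement: walk the permuted list, wrapping around."""
    if not peer_ids:
        raise ValueError("no peers available")
    order = permuted_peers(peer_ids, storage_index)
    return {
        sharenum: order[sharenum % len(order)]
        for sharenum in range(total_shares)
    }
```

a downloader can call permuted_peers() with the same storage index and ask
peers in that order, which is roughly what Download Peer Selection v2
describes.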
Directory/Filesystem Maintenance:
*v1: vdrive-based tree of MutableDirectoryNodes, persisted to vdrive's disk
     no accounts
*v2: single-host dirnodes, one tree per user, plus one global mutable space
 v3: maintain file manifest, delete on remove
 v3.5: distributed storage for dirnodes
 v4: figure out accounts, users, quotas, snapshots, versioning, etc

Checker/Repairer:
*v1: none
 v1.5: maintain file manifest
 v2: centralized checker, repair agent
 v3: nodes also check their own files

Storage:
*v1: no deletion, one directory per verifierid, no owners of shares,
     leases never expire
*v2: multiple shares per verifierid [zooko]
*v3: disk space limits on storage servers -- ticket #34
 v4: deletion
 v5: leases expire, delete expired data on demand, multiple owners per share

UI:
*v1: readonly webish (nevow, URLs are filepaths)
*v2: read/write webish, mkdir, del (files)
*v2.5: del (directories)
*v3: CLI tool
 v3.25: XML-RPC
 v3.5: XUIL?
 v4: FUSE -- http://allmydata.org/trac/tahoe/ticket/36

Operations/Deployment/Doc/Free Software/Community:
 - move this file into the wiki ?

back pocket ideas:
 when nodes are unable to reach storage servers, make a note of it, inform
 verifier/checker eventually.
 verifier/checker then puts server under observation or otherwise looks for
 differences between their self-reported availability and the experiences of
 others

 store filetable URI in the first 10 peers that appear after your own nodeid
 each entry has a sequence number, maybe a timestamp
 on recovery, find the newest

 multiple categories of leases:
  1: committed leases -- we will not delete these in any case, but will
     instead tell an uploader that we are full
   1a: active leases
   1b: in-progress leases (partially filled, not closed, pb connection is
       currently open)
  2: uncommitted leases -- we will delete these in order to make room for
     new lease requests
   2a: interrupted leases (partially filled, not closed, pb connection is
       currently not open, but they might come back)
   2b: expired leases

  (I'm not sure about the precedence of these last two. Probably deleting
  expired leases instead of deleting interrupted leases would be okay.)

big questions:
 convergence?
 peer list maintenance: lots of entries
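
sketch of the lease categories listed under back pocket ideas, to make the
deletion precedence concrete. It follows the tentative guess in the note
(expired leases go before interrupted ones) and never touches committed
leases. All names here (LeaseState, Lease, make_room) are hypothetical and
nothing like this exists in the code yet.

```python
from dataclasses import dataclass
from enum import IntEnum


class LeaseState(IntEnum):
    # Lower value = deleted earlier when space is needed.
    EXPIRED = 0      # 2b: uncommitted
    INTERRUPTED = 1  # 2a: uncommitted, uploader might come back
    IN_PROGRESS = 2  # 1b: committed, connection currently open
    ACTIVE = 3       # 1a: committed


COMMITTED = {LeaseState.ACTIVE, LeaseState.IN_PROGRESS}


@dataclass
class Lease:
    lease_id: str
    state: LeaseState
    size: int  # bytes held by this lease


def make_room(leases, needed):
    """Pick uncommitted leases to delete so that `needed` bytes are freed.

    Returns the list of leases to delete, or None if even deleting every
    uncommitted lease would not free enough space (i.e. "we are full").
    """
    candidates = sorted(
        (lease for lease in leases if lease.state not in COMMITTED),
        key=lambda lease: lease.state,
    )
    victims = []
    freed = 0
    for lease in candidates:
        if freed >= needed:
            break
        victims.append(lease)
        freed += lease.size
    return victims if freed >= needed else None
```

a None result corresponds to telling the uploader that the server is full,
as described for committed leases.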