Tuesday, 21 February 2017
Attendees: warner, meejah, exarkun, dawuud, liz, cypher, daira
- magic-wormhole state machines, using Automat (minimal sketch below)
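  A minimal sketch of an Automat state machine; the Connector class and its
  states here are hypothetical, not magic-wormhole's actual machines:

      from automat import MethodicalMachine

      class Connector(object):
          _machine = MethodicalMachine()

          @_machine.input()
          def connect(self):
              "the caller asked us to connect"

          @_machine.input()
          def connection_made(self):
              "the transport reported success"

          @_machine.output()
          def _start_dialing(self):
              print("dialing...")

          @_machine.state(initial=True)
          def idle(self):
              "not connected"

          @_machine.state()
          def connecting(self):
              "waiting for the transport"

          @_machine.state()
          def connected(self):
              "connection established"

          # transitions: an input moves the machine between states and
          # triggers the listed outputs
          idle.upon(connect, enter=connecting, outputs=[_start_dialing])
          connecting.upon(connection_made, enter=connected, outputs=[])

  Calling connect() on a fresh Connector runs _start_dialing and leaves the
  machine in "connecting".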
- IFF (liz, cypher, warner): liz will hold a user-engagement
discussion tomorrow; we should get together later in the week to talk
about it
- 7-min presentation as part of UX session
- may also present at a tool session
- #1382 servers-of-happiness: there's a PR (#402) ready to go,
passes all tests
- other PRs that should be ready:
- #375 (status): minor coverage problems
- #379 (no-Referrer header): [landed]
- #380: just documentation
- close #365: (obsoleted by #402)
- close #131: (obsoleted by #380)
- land #399: (json welcome page)
- land #400: [landed]
- daira will look at #401: (rearranging inotify tests)
- close #396 (list-aliases --readonly-uri) or #400 (one seems
obsolete): (meejah closed #396)
- clean up #226 (whitespace, argument names), then land
- fixing twisted deprecations (twisted.web.client.getPage, mostly in
tests); a replacement sketch is below
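  The usual replacement for the deprecated getPage is Agent plus readBody,
  which is what upstream Twisted recommends; a sketch of the shim the tests
  could use (the get_page name is ours):

      from twisted.internet import reactor
      from twisted.web.client import Agent, readBody

      def get_page(url):
          # url must be bytes, e.g. b"http://127.0.0.1:3456/"
          agent = Agent(reactor)
          d = agent.request(b"GET", url)
          d.addCallback(readBody)  # fires with the response body (bytes)
          return d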
- I2P vs foolscap
- warner and exarkun should dive into it
- sshfs vs tahoe
- the zero-length-file bug reported on IRC
- debug process: first make sure tahoe works, then use an SFTP client;
only then use sshfs (with debug options)
- sshfs tends to ignore close() errors
- tahoe hangs are not good at triggering errors
- removing _auto_deps.py
- for now: "tahoe --version": show just the tahoe version, nothing
else
- "tahoe --version-and-path": do the full _auto_deps double-checks,
and show all dependency versions too
- rainhill
- next step is probably to refactor tahoe's existing
uploader/downloader into Encoders that accept/produce streams (one
possible shape is sketched after this list)
- want to maintain the don't-write-shares-to-disk property: so output
is a stream, not a filehandle or bytes
- also need to update the diagrams, according to our Summit notes
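  One possible shape for such an Encoder, with hypothetical names (not the
  settled rainhill API; erasure_encode stands in for something zfec-like):
  it consumes a stream of ciphertext segments and yields blocks
  incrementally, so no complete share ever exists in memory or on disk:

      def encode_stream(segments, k, n, erasure_encode):
          # segments: iterable of ciphertext segments (bytes)
          # erasure_encode(segment, k, n) -> list of n blocks
          for segnum, segment in enumerate(segments):
              for sharenum, block in enumerate(erasure_encode(segment, k, n)):
                  # the caller streams each block to the matching storage
                  # server as it is produced
                  yield (sharenum, segnum, block)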
- Accounting
- ideally want a backend-appropriate way to store the leases
- local disk for shares plus local disk for sqlite is consistent
- S3 for shares but local disk for sqlite is not so much
- when local copy of sqlite db is lost:
- could do an immediate full enumeration of shares
- or only check lazily: if/when someone asks for a share, you check
S3 for it, if not present in DB, update DB and add a starter lease
- or something in between
- maybe monthly crawl
- some backends might provide fast enumeration of shares ("ls", get
filenames and sizes and timestamps)
- so crawler might be fast/cheap
- can do both gc and discovery of lost shares with a single crawler,
roughly once a month
- if it finds a backend share without a DB entry, it adds a starter
lease
- if there is a DB entry, but it has no leases, we delete the
backend share (this crawl is sketched below)
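  A rough sketch of that monthly crawl; the backend/db helper names here
  are hypothetical, not Tahoe's actual interfaces:

      import time

      STARTER_LEASE = 31 * 24 * 3600  # roughly one month, in seconds

      def monthly_crawl(backend, db, now=time.time):
          # pass 1: shares the backend holds but the DB lost track of
          for si in backend.list_shares():  # cheap if the backend has a fast "ls"
              if not db.has_entry(si):
                  # re-adopt the orphaned share with a starter lease
                  db.add_entry(si, lease_expiry=now() + STARTER_LEASE)
          # pass 2: DB entries whose leases have all expired -> gc
          for si in list(db.all_entries()):
              if not db.has_active_lease(si, now()):
                  backend.delete_share(si)
                  db.remove_entry(si)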
Tuesday, 28 February 2017
Attendees: ramki, jack, liz, warner, meejah, chris, exarkun, dawuud
- pycryptopp binary wheels
- exarkun has a PR to replace versioning, and another to create linux
wheels
- maybe use docker from travis
- needs a flappserver set up to upload them somewhere
- consider moving tahoe from pycryptopp to (pyca) cryptography: post to
mailing list
- Crypto++ is solid, but takes a lot of RAM to compile. It was the
only option when we started.
- pyca is active, modern, but currently depends on openssl
- warner will post to mailing list, solicit discussion
- "grid": how to explain?
- mental-model distinction between client, protocol, (set of storage
providers)
- a one-true-grid would remove the confusion, but is of course
somewhat impossible
- IPFS aspires to one-true-grid-ness, and (at least pretends) to let
you not care where your data is stored
- dropbox: company == client == storage provider
- does each grid need a brand name?
- may need different analogies for different audiences: marketing to
consumers, to developers, to businesses?
- tahoe as a protocol, vs tahoe as a program
- managing expectations. "if I download this program and run it, will
I be able to store files?"
- "the public grid": very confusing: implies public visibility of
their files
- need a different term, different adjective
- need to rewrite the "what is tahoe" intro page
- maybe: "with the tahoe software, -in conjunction with the storage
provider of your choice-, you can store stuff"
- "tahoe invite" and "tahoe create-node --join"
- meejah's work to copy introducer and parameters to new client
- ties into accounting; what permissions get conveyed?
- pluggable accounting?
- SPOFs: tor/i2p networks have a starting point or central group of
authorities
- use keybase somehow?
- send introducer/parameters to /keybase/private/me,you
- one-sided vs two-sided invitations: send pubkey up, or send secret
down? or both? Does the inviter need to contact all servers and let
them know about the new client, or do they give a cert/delegation to
the client that all servers will recognize? When a new server is
added, do all clients need to be told that they can use it?
- want caps that can work on other grids
- at least temporarily
- introducer might help with that
- HTTP-based storage server: why?
- maybe useful for static public datasets (scihub, climate data
online): read-only access with distributed publishing, very
IPFS-like
- makes download-side accounting harder: only anonymous reads are
possible
- originally an experiment by warner to investigate
foolscap/new-downloader performance problems
- HTTP-server approaches:
- plain REST-ful PUT: the server fails fast if it's going to reject
- series of smaller PUTs or POSTs, with initial "will you accept
this?" query and final "I'm done now" message
- like S3's multi-step large-upload
- not-at-all-REST-ful one signed POST per request, empty PATH, no
headers. HTTP in name only.
- map foolscap's open/write/write/close to HTTP
- warner originally favored this, to avoid security holes with
signed-headers (which always felt like
security-as-afterthought)
- should probably do:
- one PUT per share upload
- auth header with signed (hash of body, hash of relevant headers);
see the sketch after this list
- exarkun points out this can be made safe by having the server put
a shim in front, which validates the headers and then strips out
anything not covered by them before passing it to the real server
- auth header includes ed25519 pubkey that signed it
- StorageServer gets (pubkey, request details), decides whether to
accept or not
- establishing which pubkey to use, and what authority it has, is
out-of-band, or at least not in the PUT request
- accounting plugins (on both client and server sides) can have a
conversation first, to negotiate; then the server remembers (pubkey1
-> authority X), the client signs their PUT with pubkey1, and the
server checks that the request details conform to X
- could maybe work for authorized GET too
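  A sketch of that signing scheme using PyNaCl's ed25519; the header name
  and the exact signed-material layout are illustrative assumptions, not a
  settled protocol:

      import binascii, hashlib
      from nacl.signing import SigningKey

      def sign_share_put(signing_key, body, relevant_headers):
          # sign (hash of body, hash of relevant headers), not the raw
          # request, so a server-side shim can strip anything uncovered
          body_hash = hashlib.sha256(body).digest()
          header_blob = b"\n".join(
              k + b": " + v for k, v in sorted(relevant_headers.items()))
          headers_hash = hashlib.sha256(header_blob).digest()
          sig = signing_key.sign(body_hash + headers_hash).signature
          pubkey = signing_key.verify_key.encode()  # raw 32-byte ed25519 key
          # the server re-hashes, verifies, then hands (pubkey, request
          # details) to the StorageServer to accept or reject
          return {b"Authorization": b"tahoe-ed25519 "
                  + binascii.hexlify(pubkey) + b" " + binascii.hexlify(sig)}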
- "remote control" tahoe client -meejah
- full client node lives at home
- "lite" client (on phone) talks to the full client over
HTTP-something to upload/download
- like a Helper, but plaintext (over TLS of course)
- memcached frontend idea -meejah
- like FTP/SFTP/WAPI frontends
- would need to store a key->filecap table somewhere (sketched below)
- meejah/dawuud have an accounting branch that switches to an async db
api (to enable mysql or AWS cloud-DB or something, not just local
sqlite)
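  A toy sketch of the key->filecap mapping such a frontend needs, built on
  the existing WAPI (PUT /uri returns a filecap); durable storage for the
  table is hand-waved here, a real frontend would keep it in a Tahoe
  directory or a local DB:

      import requests

      WAPI = "http://127.0.0.1:3456"  # local tahoe node's web API
      table = {}                      # key -> filecap; needs durable storage

      def cache_set(key, value):
          # uploading via the WAPI returns the new filecap in the body
          filecap = requests.put(WAPI + "/uri", data=value).text.strip()
          table[key] = filecap

      def cache_get(key):
          filecap = table.get(key)
          if filecap is None:
              return None
          return requests.get(WAPI + "/uri/" + filecap).content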
The Tahoe-LAFS Weekly News is published once a week by The Tahoe-LAFS Software
Foundation, President and Treasurer: Peter Secor. Scribes: Patrick "marlowe"
McDonald, Zooko Wilcox-O'Hearn, Editor Emeritus.
Send your news stories to marlowe@antagonism.org - submission deadline:
Monday night.