January 22, 2019 Nuts & Bolts notes
Jean-Paul Calderone
jean-paul+tahoe-dev at leastauthority.com
Tue Jan 22 19:08:10 UTC 2019
January 22 2019
Jean-Paul, Meejah, Chris, Liz, Corbin, Brian
- Tahoe-LAFS Budget
- Very high-level stuff from the Aspiration contract:
- porting Tahoe and dependent libraries to Python 3
- who can do this? Meejah and others?
- future of foolscap? Python 3 only
- Can we switch to HTTPS?
- How do we preserve security properties of the write-enabler / Foolscap
TubID?
- Self-signed certificates for HTTPS; include the SPKI of the
certificate to ensure the client is talking to exactly the desired server
- The plan would be:
- Introduce HTTPS (GBS) alongside Foolscap in version X+1
- Encourage all clients to upgrade to version X+1
- Clients X and X+1 can talk to server X+1
- Drop Foolscap in X+2
- Clients need X+1 or newer to talk to server X+2 or newer
- Port to Python 3 in X+2 or later, avoiding porting Foolscap to Python 3
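The compatibility rules in the plan above can be checked with a small model. This is only a sketch: the function names are hypothetical, releases are written as offsets from X, and it assumes a client and a server of the same release speak the same set of protocols (the notes do not say whether a client at X+2 can still reach a server at X; under this model it cannot).

```python
FOOLSCAP = "foolscap"
HTTPS = "https"  # the proposed HTTPS-based protocol (GBS)


def protocols(release_offset):
    """Return the protocols spoken by release X + release_offset (assumed model)."""
    if release_offset <= 0:
        return {FOOLSCAP}          # release X: Foolscap only
    if release_offset == 1:
        return {FOOLSCAP, HTTPS}   # release X+1: both, side by side
    return {HTTPS}                 # release X+2 and later: Foolscap dropped


def can_talk(client_offset, server_offset):
    """A client and server can talk if they share at least one protocol."""
    return bool(protocols(client_offset) & protocols(server_offset))


if __name__ == "__main__":
    # Print the compatibility matrix for releases X, X+1 and X+2.
    for server in range(3):
        for client in range(3):
            print("client X+%d, server X+%d: %s"
                  % (client, server, can_talk(client, server)))
```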
- Plain HTTP (1.1 at least; maybe also 2 alongside if it proves
sufficiently beneficial) for most of the API; WebSockets or SSE for
possible future server-push or bidirectional communications (e.g. subscribe
to changes)
- Ask for funds (matching?) from the PSF for porting work:
https://www.python.org/psf/grants/
- Supporting PyPy compatibility (reviewing dependencies, exploring
effort for this)
- Swap pycryptopp out with PyCA "cryptography"
- for AES, RSA (mutable files), and SPKI measurement of X.509
certificates (for the switch from Foolscap to HTTPS)
- improving grid operation/management tools
- community outreach, UI/UX improvements, documentation
- adding new community-requested features, improving garbage collection
- possibly run another summit (or small-grants program)
- Security audit(s) of Tahoe-LAFS?
- Distribution process input and transparency (Liz wants to make sure
this happens)
- Other ongoing Tahoe-LAFS work
- Magic-Folders
- Understanding the spec and convincing ourselves it is correct
- Understanding the implementation and convincing ourselves it follows
the spec
- General working with respect to spurious conflicts & backups
- Existing progress reporting API
- GridSync desires for activity reporting APIs
- Ejecting the admin/collective dircap SPOF
- Improving upload speeds
- The macOS support branch/PR
- Corbin's Kubernetes/Tahoe-LAFS work at Matador
- Cloud-native features
- Prometheus-compatible metrics
- 'Farming/Ranching' nodes on k8s
- The state of RAIC
- Abandoning PR 408 ( https://github.com/tahoe-lafs/tahoe-lafs/pull/408 )
- Is funding for any of this work desirable?
- "cloud-backend" branch related things:
- Meejah's config refactoring
- Meejah's async initialization refactoring (needs above)
- "Accounting" in general (cloud-backend, ..) (needs above two)
- possible user-visible feature here: re-loading without re-starting
would be nice
- https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2952
- to clarify: the above two would be required to support re-loading, but
aren't sufficient
random notes:
- foolscap: do we move to HTTP, *then* move to Py3 (thus, no py3 port of
foolscap)?
- brian: what about write-enabler? (needs to be bound to channel). TLS
doesn't give us this.
- definitely don't want to buy certs
- self-signed (or something) cert, included in the announcement: erase the
default list of CAs in the client and populate it with JUST the certs you
found from the Introducer (or ... out-of-band)
- getting rid of write-enabler is a longer-term goal (and a protocol change)
- exarkun: there's some standard for identifying a certificate
(something like SHA256 of .. some fields of the cert?).
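One existing convention of this shape is the pin format from RFC 7469 (HTTP Public Key Pinning): the base64 encoding of a SHA-256 digest over the DER-encoded SubjectPublicKeyInfo. Whether this is the standard meant in the meeting is an assumption. The sketch below only covers hashing and comparing; it assumes the SPKI bytes have already been extracted from the certificate (for example with a library such as PyCA "cryptography"), and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def spki_pin(spki_der):
    """Return base64(SHA-256(spki_der)) as text, in the style of RFC 7469 pins."""
    digest = hashlib.sha256(spki_der).digest()
    return base64.b64encode(digest).decode("ascii")


def pin_matches(spki_der, expected_pin):
    """Compare the server's SPKI pin with the expected one in constant time."""
    actual = spki_pin(spki_der).encode("ascii")
    return hmac.compare_digest(actual, expected_pin.encode("utf-8"))


if __name__ == "__main__":
    # Placeholder bytes, NOT a real key; a real caller would pass the
    # DER-encoded SubjectPublicKeyInfo taken from the server certificate.
    announced = spki_pin(b"placeholder SPKI bytes")
    print(pin_matches(b"placeholder SPKI bytes", announced))
```

A client that learned the expected pin from the announcement would accept a TLS connection only if `pin_matches` succeeds for the certificate the server presents.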
- would be nice to have "push" notifications of some kind (so maybe
pub/sub even?)
- if uploads are PUTs:
- there's a sort of "pre-check" thing already
- (can't recall API, but "reserve and ...")
- then multiple PUTs follow with the actual segments
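The reserve-then-PUT flow described above can be modelled without any network code. The following is a sketch under stated assumptions: `InMemoryServer`, `reserve`, `put_segment` and `upload` are hypothetical names, not an existing Tahoe-LAFS or Foolscap API, and each method stands in for one plain request/response exchange.

```python
import itertools


class InMemoryServer:
    """Stands in for an HTTP storage server; each method models one request."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._uploads = {}

    def reserve(self, size):
        # Models the "pre-check" request: the server agrees to accept `size` bytes.
        upload_id = next(self._ids)
        self._uploads[upload_id] = bytearray(size)
        return upload_id

    def put_segment(self, upload_id, offset, segment):
        # Models one PUT carrying a single segment at a known offset.
        buf = self._uploads[upload_id]
        if offset + len(segment) > len(buf):
            raise ValueError("segment exceeds reserved size")
        buf[offset:offset + len(segment)] = segment

    def read(self, upload_id):
        return bytes(self._uploads[upload_id])


def upload(server, data, segment_size):
    """Reserve space, then send the data one segment per request."""
    upload_id = server.reserve(len(data))
    for offset in range(0, len(data), segment_size):
        server.put_segment(upload_id, offset, data[offset:offset + segment_size])
    return upload_id
```

Every step here is initiated by the client and answered by the server, so nothing in this flow needs a bidirectional channel.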
- simpson: shouldn't think "need bidirectional stuff all the time"
- foolscap API is like:
- "i want to upload X stuff"
- here's a reader/writer for the shares
- send "segments" one at a time to the writer
- right now server "has" to seek sometimes (e.g. writes most stuff, but
goes back to update headers?)
- HTTP2 was mentioned (but no websockets there)
- HTTP1, HTTP2: could both be supported on the server
- Python3 port: should be to Python 3.5 (this means we support PyPy
right away, and doesn't limit us too much -- 3.5 includes async-def + await)
- exarkun: should "the project" put together a porting guide? (Twisted
had one)
- brian: whoever is doing the work is probably in the best position to
write ^
- porting steps:
- what deps need porting? https://pypi.org/project/caniusepython3/
- (no foolscap)
- HTTP API (so that foolscap can go away)
- unicode vs bytes is always fun for py2->py3
- pycryptopp: can we swap this for something else? (python2-only right
now)
- used for AES bulk encryption + RSA encryption for mutable files
- presumably .. something else (cryptography?) can do this stuff
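As a reminder of why unicode vs bytes comes up in a py2->py3 port, the sketch below shows the Python 3 behaviour that tends to surface during porting. The helper names `to_bytes` and `to_text` are hypothetical and are not taken from the Tahoe-LAFS code base.

```python
def to_bytes(value, encoding="utf-8"):
    """Return `value` as bytes, encoding it only if it is text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding it only if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


if __name__ == "__main__":
    name = "caf\u00e9"
    # Text length counts code points; encoded length counts bytes.
    print(len(name), len(to_bytes(name)))
    # In Python 3, text and bytes never compare equal ...
    print("abc" == b"abc")
    # ... and indexing bytes yields an integer, not a one-character string.
    print(b"abc"[0])
```

Converting at the boundaries (network, disk) and keeping one type internally is a common way to contain this class of bug.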
- Brian: other things that are important?
- deployment (can we make that easier?)
- simpson: cloud-native deployment
- mostly important to have config "in one place" / one dir
- ...so "lots of command-line options" doesn't necessarily help
- frustrating: putting storage/other things "in" containers is hard
- frustrating: generation and management of "the configuration" for that
node
- e.g.: bringing up a storage-server: mount the "big storage", then copy
in some secrets, then "sed" it a bunch to let it know some stuff (e.g.
editing tahoe.cfg ...?) .. "only" 10 lines of shell, 20 lines of docker ..
but, still a cost.
- exarkun: "figuring out these lines" is hard (and "did tahoe change
some stuff that made this harder"?)
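One alternative to the "sed it a bunch" step is to treat the config as structured data. This is a minimal sketch, assuming tahoe.cfg can be read as an INI file by Python's `configparser`; `apply_overrides` is a hypothetical helper, and the section and option names in the example are illustrative only.

```python
import configparser
import io


def apply_overrides(config_text, overrides):
    """Return config_text with each (section, option) set to the given value."""
    # interpolation=None so that '%' in values is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    for (section, option), value in overrides.items():
        if not parser.has_section(section):
            parser.add_section(section)
        parser.set(section, option, value)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    template = "[node]\nnickname = placeholder\n"
    print(apply_overrides(template, {
        ("node", "nickname"): "storage-0",   # illustrative values
        ("storage", "enabled"): "true",
    }))
```

Note that `configparser` drops comments when it writes the file back out, which may or may not be acceptable for a generated, per-container config.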
- "storage backends" aren't maybe as big a thing as we thought? (more
and more it's possible to get the cloud-provider/orchestrator to "just
mount" storage -- e.g. "Azure files" can be mounted in an Azure host)
- brian: can you do simultaneous mounts? ("probably")
- actual use-case: a storage-server can be backed by some cloud storage
- ...so "direct access" to that storage is "just" an optimization
- the most-interesting "API" here is "containers and mounts", not e.g.
*direct* support for S3 in Tahoe.
- brian: where's "Accounting" at?
- ...what features does it enable?
- someone is already paying for storage
- can we have "accounting-aware" storage-servers with a facade that
speaks "the old API"?
- "more-incremental" is better
- examples of billing: access to grid is free, but pay for "value add"
services on top (because: no way to account for usage)
- example feature: can limit usage of participants
- "too cheap to meter"
- Accounting adds identity whereas Tahoe-LAFS has had no such concept
before
- exarkun: would be nice if servers logged *everything* that they can
from the clients (because: a malicious server could do this) and so this
will make us/everyone more aware of shortcomings etc
- (although Accounting + Identities might sound scary: we can probably
already figure this out, but just .. don't)
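The facade idea and the "limit usage of participants" example from the Accounting discussion above could look roughly like the following. Everything here is hypothetical (class names, method names, the quota model); it is a sketch of the shape of the idea, not a description of the cloud-backend branch or of any existing Tahoe-LAFS interface.

```python
class QuotaExceeded(Exception):
    """Raised when an account would go over its storage limit."""


class AccountingStorage:
    """Accounting-aware store: every write is attributed to an account."""

    def __init__(self, quotas):
        self._quotas = dict(quotas)                  # account -> byte limit
        self._used = {account: 0 for account in self._quotas}
        self._shares = {}

    def put_share(self, account, share_id, data):
        if account not in self._quotas:
            raise KeyError(account)
        if self._used[account] + len(data) > self._quotas[account]:
            raise QuotaExceeded(account)
        # Simplification: re-uploading the same share_id is charged again.
        self._used[account] += len(data)
        self._shares[share_id] = data

    def usage(self, account):
        return self._used[account]


class LegacyFacade:
    """Old-style interface: callers do not identify themselves."""

    def __init__(self, storage, default_account):
        self._storage = storage
        self._default_account = default_account

    def put_share(self, share_id, data):
        # All legacy traffic is billed to one pre-agreed account.
        self._storage.put_share(self._default_account, share_id, data)
```

The point of the facade is incremental adoption: existing clients keep working against `LegacyFacade`, while newer clients can talk to `AccountingStorage` directly.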
- simpson: grid-manager is interesting for his use-cases (to make the
offering more-consistent, e.g. assuring customers they're uploading to the
"managed storage servers")