[tahoe-lafs-weekly-news] TWN 72

Patrick R McDonald marlowe at antagonism.org
Tue Feb 21 01:10:24 UTC 2017


Tahoe-LAFS Weekly News, issue number 72, February 20 2017
=========================================================

Welcome to the Tahoe-LAFS Weekly News (TWN).  Tahoe-LAFS_ is a secure,
distributed storage system. `View TWN on the web`_ *or* `subscribe to
TWN`_.
If you would like to view the "new and improved" TWN, complete with pictures,
please take a `look`_.

.. _Tahoe-LAFS: https://Tahoe-LAFS.org
.. _View TWN on the web:
  https://Tahoe-LAFS.org/trac/Tahoe-LAFS/wiki/TahoeLAFSWeeklyNews
.. _subscribe to TWN:
  https://Tahoe-LAFS.org/cgi-bin/mailman/listinfo/Tahoe-LAFS-weekly-news
.. _look: https://Tahoe-LAFS.org/~marlowe/TWN72.html

ANNOUNCING Tahoe, the Least-Authority File Store, v1.12.1
=========================================================

On behalf of the entire team, I'm pleased to announce the 1.12.1 release
of Tahoe-LAFS.

Tahoe-LAFS is a reliable encrypted decentralized storage system, with
"provider independent security", meaning that not even the operators of
your storage servers can read or alter your data without your consent.
See http://Tahoe-LAFS.readthedocs.org/en/latest/about.html for a
one-page explanation of its unique security and fault-tolerance
properties.

With Tahoe-LAFS, you distribute your data across multiple servers. Even
if some of the servers fail or are taken over by an attacker, the entire
file store continues to function correctly, preserving your privacy and
security. You can easily share specific files and directories with other
people.

The 1.12.1 code is available from the usual places:

* pip install Tahoe-LAFS
* https://github.com/Tahoe-LAFS/Tahoe-LAFS

  * tag: "Tahoe-LAFS-1.12.1"
  * commit SHA1: ce47f6aaee952bfc9872458355533af6afefa481
* https://Tahoe-LAFS.org/downloads/ (and SHA256 hashes)

  * Tahoe-LAFS-1.12.1.tar.gz
    327b364a702df515fd329d49f052db0fcbf468e20c26d1f8df819f54786ca0ce

  * tahoe_lafs-1.12.1-py2-none-any.whl
    070d2a4c4ea220863ff078e8032c01746572e5513c4ac26dc3de4bd03f6d25c1

  * detached GPG signatures (.asc) are present for each file

All tarballs, and the Git release tag, are signed by the Tahoe-LAFS
Release Signing Key (fingerprint E34E 62D0 6D0E 69CF CA41 79FF BDE0 D31D
6866 6A7A), available for download from
https://Tahoe-LAFS.org/downloads/tahoe-release-signing-gpg-key.asc
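
As a rough illustration of checking a download against the SHA256 hashes
listed above, here is a minimal Python sketch. The helper names
``sha256_of_file`` and ``matches_published_hash`` are made up for this
example and are not part of Tahoe-LAFS. A hash comparison only confirms that
the file matches this announcement; it does not replace verifying the
detached GPG signature.

```python
import hashlib

# SHA256 hashes copied from the release announcement above.
PUBLISHED_SHA256 = {
    "Tahoe-LAFS-1.12.1.tar.gz":
        "327b364a702df515fd329d49f052db0fcbf468e20c26d1f8df819f54786ca0ce",
    "tahoe_lafs-1.12.1-py2-none-any.whl":
        "070d2a4c4ea220863ff078e8032c01746572e5513c4ac26dc3de4bd03f6d25c1",
}


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, name):
    """Compare a local file (hypothetical path) with the published hash."""
    return sha256_of_file(path) == PUBLISHED_SHA256[name]
```

For example, ``matches_published_hash("Tahoe-LAFS-1.12.1.tar.gz",
"Tahoe-LAFS-1.12.1.tar.gz")`` would be called from the directory that holds
the downloaded tarball.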

Full installation instructions are available at:

 http://Tahoe-LAFS.readthedocs.io/en/Tahoe-LAFS-1.12.1/INSTALL.html

1.12.1 fixes a few small problems in the 1.12.0 release: the
multiple-introducers ("introducers.yaml") feature, which was completely
broken, has been repaired; creating nodes with --hide-ip on I2P-only
systems no longer sets "tcp = tor"; and at least one --listen=I2P
problem was fixed. Please see the NEWS file for details:

 https://github.com/Tahoe-LAFS/Tahoe-LAFS/blob/ce47f6aaee952bfc9872458355533af6afefa481/NEWS.rst

Many thanks to Least Authority Enterprises for sponsoring developer time
and contributing the new Magic Folders feature.

This is the seventeenth release of Tahoe-LAFS to be created solely as a
labor of love by volunteers. Thank you very much to the team of "hackers
in the public interest" who make Tahoe-LAFS possible. Contributors are
always welcome to join us at https://Tahoe-LAFS.org/ and
https://github.com/Tahoe-LAFS/Tahoe-LAFS .

Brian Warner
on behalf of the Tahoe-LAFS team

January 18, 2017
San Francisco, California, USA

Mailing List
============


Devchat
==============

Tuesday, 10 January 2017
------------------------
Attendees: warner, meejah, liz, jp, str4d, daira, cypher, dawuud

* `Debian`_ freeze is happening soon, trying to fix critical bugs and
  make a new release in the next week or so
* `I2P`_ bugs found at `CCC`_

  * `#2861`_: negotiation failure when I2P client connects to I2P
    server

    * str4d and warner to pair this time tomorrow
  * `#2858`_: I2P provider/handler

    * PR33 on foolscap
  * `#2859`_ can be closed
* `Tor`_ blog post has a comment about `#2862`_ (introducers.yaml docs
  syntax failure)
* extras_require on win32 (`#2763`_) may require newer pip, check Debian
  to see if it's in place

  * potential problem is that "pip install Tahoe-LAFS" on Debian won't
    work
  * but Debian packaging of Tahoe-LAFS would probably be ok
  * pip docs say platform= is supported since pip-6
* what version of twisted will go into the Debian freeze (`#2857`_)
* cloud-backend: warner should look at 2237.cloud-backend-merge.0

  * that is cloud-backend, with master merged in, with some additional
    meejah commits on top
  * still a few unit tests that fail
  * daira/meejah will do some rebase/rewriting work, merge master into
    it again
  * warner will treat that branch as a resource to mine diffs from,
    will review some diffs and apply to master
  * then new master will be merged into the branch again
  * over time the cloud-backend branch diff will shrink
  * big changes/refactoring

    * lots of APIs went from sync to async
    * so tests got harder
  * exarkun cautions against Mock and inlineCallbacks

    * see txaws tests: txaws.supplied.s3fake , returns pre-resolved
      Deferreds
    * use TestCase.successResultOf/failureResultOf instead of
      inlineCallbacks
    * mock leads you to tests that know too much about the internals
      of the implementation, and are fragile
    * instead, write second simple implementation of your real
      interface, which only operates on local memory

      * e.g. named "MemoryXYZ" instead of "XYZ"
    * write tests against that interface
    * tests can run against either implementation
    * one test runs against both, and uses external dependencies, etc,
      with async and inlineCallbacks

      * exercises XYZ specifically
    * other tests only use the simple implementation, and are
      synchronous

      * merely uses XYZ
* what people are interested in

  * meejah: servers of happiness
  * dawuud: generic accounting plugin api
* should tahoe be all "plumbing"?

  * command plugin mechanism like git/hg/twisted.plugins
  * should we add more stuff to tahoe itself, or to apps on top?
  * tahoe as library
  * application targets

    * things that need a key-value store
    * can we shape tahoe in a way that is suitable for those
      applications?
    * could we plug into `Slack`_? `LibreOffice`_ "save to my tahoe
      grid" option?
    * `Spideroak`_/`Semaphor`_ storage backend?
    * `Signal`_? `Wire`_? file transfer backend
* when should the next summit be?

  * maybe before/after `PyCon`_ in May? Portland
  * tor-dev in Montreal in September?
  * `IFF`_ in Spain in March?
  * After `CCC`_ before `RWC`_ in Jan 2018
* revocable immutable files? meejah

  * ask server to re-encrypt ciphertext with a stream cipher, append
    new key to an encrypted list, issue new readcaps
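
exarkun's testing advice from this devchat (write a second, memory-only
implementation of the real interface, and write the tests against that
interface) could look roughly like the sketch below. The classes
``MemoryBucketStore`` and ``DirectoryBucketStore`` and their ``put``/``get``
interface are hypothetical, not Tahoe-LAFS or txaws APIs. To stay
self-contained the sketch is synchronous and uses only the standard library,
so it does not show Twisted's ``successResultOf``.

```python
import os
import tempfile
import unittest


class MemoryBucketStore:
    """Hypothetical memory-only implementation (the "MemoryXYZ")."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]  # raises KeyError if absent


class DirectoryBucketStore:
    """Hypothetical "real" implementation (the "XYZ"), backed by disk."""

    def __init__(self, root):
        self._root = root

    def put(self, key, value):
        with open(os.path.join(self._root, key), "wb") as f:
            f.write(value)

    def get(self, key):
        try:
            with open(os.path.join(self._root, key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            raise KeyError(key)


class StoreContract:
    """Tests written against the interface, not an implementation."""

    def make_store(self):
        raise NotImplementedError

    def test_round_trip(self):
        store = self.make_store()
        store.put("share-0", b"data")
        self.assertEqual(store.get("share-0"), b"data")

    def test_missing_key(self):
        with self.assertRaises(KeyError):
            self.make_store().get("absent")


class MemoryStoreTests(StoreContract, unittest.TestCase):
    def make_store(self):
        return MemoryBucketStore()


class DirectoryStoreTests(StoreContract, unittest.TestCase):
    def make_store(self):
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)
        return DirectoryBucketStore(tmp.name)
```

Code that merely uses the store can then be tested against
``MemoryBucketStore`` alone, without mocks that know about the internals of
the real implementation.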

.. _`#2861`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2861
.. _`#2858`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2858
.. _`#2859`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2859
.. _`#2862`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2862
.. _`#2763`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2763
.. _`#2857`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2857
.. _`Debian`: https://www.debian.org
.. _`Tor`: https://torproject.org
.. _`CCC`: https://en.wikipedia.org/wiki/Chaos_Communication_Congress
.. _`I2P`: https://geti2p.net/en/
.. _`Slack`: https://www.slack.com
.. _`LibreOffice`: https://www.libreoffice.org/
.. _`Spideroak`: https://spideroak.com/
.. _`Semaphor`: https://spideroak.com/personal/semaphor
.. _`Signal`: https://whispersystems.org/
.. _`Wire`: https://wire.com/en/
.. _`PyCon`: https://us.pycon.org/2017/
.. _`IFF`: https://internetfreedomfestival.org/
.. _`RWC`: https://www.realworldcrypto.com/

Tuesday, 24 January 2017
------------------------
Attendees: ramki, warner, meejah, exarkun, dawuud, cypher

* 1.12.1 landed in Debian unstable, will migrate to testing 29-jan (with
  Foolscap)

  * https://tracker.Debian.org/pkg/Tahoe-LAFS
  * "tahoe --version" has (cosmetic) warnings about missing dependencies
  * artifact of Debian's installed packages using .egg-info/ with
    empty requires.txt
* `#1382`_ (servers-of-happiness): dawuud and meejah have been refactoring
  the branch, improving the tests

  * made small change to spec, server with no space is treated as
    read-only
  * the actual spec-change is in 17c562129d464e000a3a7e0d14b4d751bf3be0e6
  * In step 6, "Let an edge exist between server S and share T if and
    only if S already has T, or *could* hold T (i.e. S has enough
    available space to hold a share of at least T's size)." -> becomes
    "Construct a bipartite graph G3 of (only readwrite) servers to shares
    (some shares may already exist on a server)."
  * there's some code duplication that could be refactored, maybe in the
    future, in util/happinessutil.py
* `#2861`_ (I2P vs TLS) is still broken, warner needs some time with
  str4d and wireshark to debug
* JSON welcome page (`#2476`_): dawuud is getting ready to land

  * provides several pieces: introducer info, other-servers info,
    my-storage-server status info, versions
  * should we make separate child URL paths for the individual pieces?
  * naw, just one big GET /?t=json
  * future new-WAPI can provide something more civilized
* cloud-backend: why does it need accounting?

  * warner: backgrounder on starter-leases, transition to leasedb-capable
    version, bootstrap after loss of leasedb
  * exarkun: could we move the leases-database out to the cloud backend,
    remove local mutable state from server
  * Amazon RDS?
* direct-to-S3 mode: maybe give up accounting/GC in that case

  * would still be useful for some personal use cases
  * super hard to do strong accounting (with adversarial clients) without
    a real server
  * hacky IAM roles? eww
* leases in S3 as files with one-line-per-SI? (plus accounting
  identifier, expiration time)

  * occasional fetch, populate sqlite, run query, forget sqlite
  * sometimes delete the S3 files when they've expired
  * server writes these files on behalf of identified clients
* or leases as files in tahoe itself?

  * one account per directory, so only one writer, so no conflicts
  * written by storage server, not by clients
  * S3 holds (tahoe) SERVERNAME/leases/CLIENTID/FILES
* spectrum of options

  * 1: store lease info in non-durable efficiently-queryable location
    (not on backend): sqlite
  * 2: keep two copies, try to keep in sync

    * maybe just write whole .sqlite file into S3 after each change
  * 3: fetch and build ephemeral DB when you want to make queries, then
    throw it away

    * canonical lease table is stored as loose files in S3, occasionally
      pruned
  * 4: store info in clever loose backend way, but queries will probably
    be expensive
  * would look a lot like the old local-crawler approach, but with files
    in S3, async reads
* meejah's cloud-backend branch is still the right one to mine patches
  from (2237.cloud-backend-merge.0)
* make a new 'lafs' CLI command? with cleaner subcommand tree?

  * leave 'tahoe' as plumbing, use 'lafs' as porcelain?
  * plugins?
* adding an attenuate/diminish CLI command (to get from writecap to
  readcap)?
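
The bipartite graph of servers to shares mentioned under `#1382`_ can be
illustrated with a small sketch. It assumes that "happiness" is measured as
the size of a maximum matching between servers and shares, and it uses a
plain augmenting-path search. This is not the code in
util/happinessutil.py; the function names and the ``free_space`` input are
hypothetical.

```python
def maximum_matching(edges):
    """Return a maximum matching as a dict mapping share -> server.

    `edges` maps each server to the shares it holds or could hold.
    """
    match = {}

    def try_assign(server, seen):
        # Look for a share for `server`, re-homing other servers if needed.
        for share in edges[server]:
            if share in seen:
                continue
            seen.add(share)
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    for server in edges:
        try_assign(server, set())
    return match


def happiness(edges):
    """Number of servers that each end up with a distinct share."""
    return len(maximum_matching(edges))


def readwrite_only(edges, free_space):
    """Drop servers with no space, treating them as read-only."""
    return {s: shares for s, shares in edges.items()
            if free_space.get(s, 0) > 0}
```

With ``edges = {"A": [0, 1], "B": [0], "C": [0]}``, servers B and C compete
for share 0, so at most two servers can be given distinct shares.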

.. _`#1382`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/1382
.. _`#2476`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2476
.. _`#2816`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2816
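
Option 3 from the lease discussion (fetch the loose lease files, build an
ephemeral database, run the query, throw it away) might be sketched as
follows. The line format used here (storage index, accounting identifier,
and expiration time separated by whitespace) is an assumption for
illustration only, and plain strings stand in for the files that would be
fetched from S3.

```python
import sqlite3


def build_ephemeral_lease_db(lease_file_contents):
    """Load lease lines into a throwaway in-memory sqlite database.

    Each line is assumed to look like: "<SI> <account-id> <expires>".
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE leases "
        "(storage_index TEXT, account TEXT, expires INTEGER)")
    for text in lease_file_contents:
        for line in text.splitlines():
            if not line.strip():
                continue  # skip blank lines
            storage_index, account, expires = line.split()
            db.execute("INSERT INTO leases VALUES (?, ?, ?)",
                       (storage_index, account, int(expires)))
    return db


def fully_expired(db, now):
    """Storage indexes whose every lease has expired at time `now`."""
    rows = db.execute(
        "SELECT storage_index FROM leases GROUP BY storage_index "
        "HAVING MAX(expires) <= ? ORDER BY storage_index", (now,))
    return [row[0] for row in rows]
```

The database lives only in memory, so closing the connection discards it,
which matches the "forget sqlite" step in the notes.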

Tuesday, 31 January 2017
------------------------

Attendees: warner, str4d, meejah, daira

We spent the whole time investigating `#2861`_ (an SSL handshake failure
when using I2P on 1.12).

The root cause was found to be txI2P's unusual approach to server
connections, coupled with Twisted's TLS handling.

Most protocols (TCP, Tor) receive inbound connections by listening on a
TCP socket, and then accepting connections (either from the real client,
for TCP, or from the local Tor daemon). I2P is an exception, because the
Tahoe server makes an *outbound* connection to the I2P daemon, then asks
the daemon to use that TCP link for *inbound* I2P connections.

Twisted uses the type of the underlying connection (outbound client, or
inbound server) to decide which kind of TLS handshake it should emit: a
ClientHello, or a ServerHello. TLS requires exactly one side to send a
ClientHello, after which the other side sends the matching ServerHello.
When both sides are using client-like connections, both sides send a
ClientHello, and the TLS negotiation fails.
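
The double-ClientHello failure can be reproduced without I2P or Twisted. The
sketch below uses only Python's standard ``ssl`` module with in-memory BIOs:
two endpoints are both set up as TLS clients and each is handed the other's
first message. It stands in for the real txI2P/Foolscap connection and is
not the code path Tahoe-LAFS uses; the helper names are made up for this
example.

```python
import ssl


def make_client():
    """Create a client-side TLS endpoint backed by in-memory buffers."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    return ctx.wrap_bio(incoming, outgoing), incoming, outgoing


def step(tls):
    """Advance the handshake once and report what happened."""
    try:
        tls.do_handshake()
        return "done"
    except ssl.SSLWantReadError:
        return "waiting"  # needs data from the peer
    except ssl.SSLError:
        return "failed"


def both_sides_as_clients():
    a, a_in, a_out = make_client()
    b, b_in, b_out = make_client()
    first = (step(a), step(b))  # each side emits a ClientHello
    a_hello, b_hello = a_out.read(), b_out.read()
    a_in.write(b_hello)  # deliver each ClientHello to the other side
    b_in.write(a_hello)
    second = (step(a), step(b))  # each side expected a ServerHello
    return first, second, a_hello, b_hello
```

Both endpoints first report that they are waiting for the peer, and then
fail once they receive a ClientHello where a ServerHello was expected.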

We're trying to figure out the cleanest way to fix this. It might be to
patch Twisted to add a new argument to the startTLS() call (probably
"side=", so you could explicitly request either client or server, and
ignore the underlying connection type). We'd then make a corresponding
change to Foolscap, wait for the next Twisted release, and bump the
dependencies.

Or it might be easier to change Foolscap's TLS handling, to switch to
TLS in a different way, that would give us more control over the
handshake side it uses (in short, switch from startTLS to direct use of
TLSMemoryBIOProtocol). That wouldn't require any changes to Twisted,
just a new version of Foolscap, but would probably be more work.

Feel free to follow along on
https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2861 for the details.

Other work that's ongoing:

* ramki got tahoe 1.12.1 into Debian (sid) in plenty of time to make
  the Stretch freeze. 1.12.1 is now in "testing", and everything in
  "testing" will be frozen for the Stretch release on 05-Feb. If you use
  Debian (sid or testing), please "apt install Tahoe-LAFS" and make sure
  everything works as expected. We know of two problems right now:
  I2P doesn't work (see above, but it doesn't matter quite so much
  because I2P isn't packaged in Debian yet), and "tahoe
  --version" emits a scary-looking but benign warning about dependency
  versions.
* meejah and dawuud have been working hard at bringing `#1382`_
  (servers-of-happiness, server-selection cleanups) up to date, so
  hopefully we can land it soon
* meejah has also been working on `#2237`_ (cloud-backend), and I think
  the next step will be to incrementally land changes from that branch
  on trunk, then merging from master back into the branch until it
  shrinks away into nothing. Basically mining the branch for patches in
  an order that makes review and merging easier to manage.
* there are a couple of other PRs on github that should be landable
  without too much work

.. _`#2237`: https://Tahoe-LAFS.org/trac/Tahoe-LAFS/ticket/2237

Tuesday, 07 February 2017
-------------------------

Attendees: warner, liz, dawuud, meejah, exarkun

* dawuud and meejah are rewriting the #1382 servers-of-happiness branch

  * markberger did a clever set-rearrangement thing, makes it run much
    faster than the algorithm we wrote up at the summit
  * they're rewriting his code as functions, bringing it up to date with
    our coding standards
* accounting for S4-like services (shares on S3, tahoe server on EC2)

  * need to reconstruct lease data without EC2 state
  * could store lease data in small files next to shares, keep sqlite
    cache on EC2 box, rebuild when necessary
  * make it part of the pluggable storage backend
  * should it be an external command? or built in, running automatically
    at startup if the DB is missing?

    * it will take a while: must fetch all lease records from S3
  * not just leases: also accounts and account attributes
  * just dump whole .sqlite file into S3?
  * backend should be responsible for this: could choose to use a cloud
    DB service

    * maybe add an exception type for backends to raise during setup that
      means "please tell the operator to run a recovery command"
    * for upgrades and recovery
    * backend also has the option to do recovery automatically
  * Accountant is shared, but its state is stored in a backend-specific
    way
  * S3 has "immediate consistency" for reading new objects that were not
    read before being created

    * and eventual consistency for everything else
    * so try to avoid modifying shares
* goal is to allow a copy of .tahoe to serve as a backup

  * node can modify the contents for a few seconds after startup, but
    should then stop
  * should not require continuous backup of .tahoe
  * exarkun points out that it'd be better to be able to have a "tahoe
    init" command

    * All the state that is part of the node identity is created by this
      step
    * If you backup .tahoe after running this, you can always reconstitute
      the same node from that backup
* "node state storage subsystem"

  * not just shares
  * accounting info, runtime-discovered config data
* exarkun thinks about storing this in a DB for analytics
* could we use tahoe to store its own config/state?

  * worried about performance
  * would introduce extra dependencies: server A would depend upon server
    B for its own state
* maybe just use storage-backend for it

  * SI=accounting-thing-1
  * account manager asks share storage backend to write data to a
    known-SI
  * needs to be encrypted

    * general principle: protect server against its own storage backend
    * part of .tahoe/private/ is a key that encrypts that data
    * refactor file encoding/decoding code to be able to use it locally

      * "please encrypt this (state thing), one share only"
      * then turns around to write the ciphertext into the storage backend
  * goal is for .tahoe/private to be snapshotted once, right after
    startup, and that should be sufficient as a backup
* storing things in different ways depending upon how fast they happen

  * "low rate": node init
  * "medium rate": accounting changes: Alice is given permission to
    write, etc
  * "high rate": shares being modified
  * we're willing to make the operator do a backup of .tahoe/ for
    low-rate changes
  * willing to make S3 writes of databases/etc for medium-rate, but not
    high-rate
  * must be willing to make S3 writes (of flat data) for high-rate (share
    changes)
* maybe deployment makes a decision

  * speed of local sqlite, but not persistent
  * security of local sqlite: not exposed to other cloud users
  * persistence (but low-performance) of S3-stashed .sqlite files
  * persistence (but low-security) of real AWS cloud DB
* starting point: add API to storage server for "local data"
  (config/private/etc)

  * must be async
  * code in Client or Node that uses self.write_private_config could be
    changed to use Client.write_something, which delegates to new
    StorageServer API
* related to replacing tahoe.cfg with tahoe.sqlite

  * must write "tahoe config" CLI command
  * lose ability to edit config with text editor
  * most users don't want to use a text editor
  * you can instruct someone to copy+paste a CLI command, but not
    instructions for a text editor
* talking about Petmail, Vuvuzela, rerandomizable tokens
* I2P problem

  * both sides were being TLS clients. TLS requires one client and one
    server.
  * could change Twisted's .startTLS() api to let you specify the side
  * or could change Foolscap to wrap the underlying protocol itself

    * this would enable Foolscap-over-X, where X is an ITransport but not
      real TCP
    * maybe some new protocol that's implemented in pure Twisted, rather
      than TCP to a local daemon
* http://www.lothar.com/blog/55-Git-over-Tahoe-LAFS/
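
The idea of an exception type that a backend raises during setup to mean
"please tell the operator to run a recovery command" might be sketched as
follows. Everything here is hypothetical: ``RecoveryNeeded``,
``LeaseDBBackend``, and ``start_node`` are illustrative names rather than
Tahoe-LAFS APIs, and the recovery command is left as a placeholder because
no such command was named in the discussion.

```python
class RecoveryNeeded(Exception):
    """Hypothetical: a backend cannot start until recovery is run."""

    def __init__(self, reason, command):
        super().__init__(reason)
        self.reason = reason
        self.command = command


class LeaseDBBackend:
    """Hypothetical backend whose lease database may be missing."""

    def __init__(self, leasedb_present, auto_recover=False):
        self.leasedb_present = leasedb_present
        self.auto_recover = auto_recover

    def rebuild(self):
        # Placeholder for the slow step of fetching all lease records.
        self.leasedb_present = True

    def setup(self):
        if self.leasedb_present:
            return "ready"
        if self.auto_recover:
            # A backend may choose to recover automatically.
            self.rebuild()
            return "recovered"
        raise RecoveryNeeded("lease database is missing",
                             "<recovery command>")


def start_node(backend):
    """Start the node, or tell the operator what to run."""
    try:
        state = backend.setup()
    except RecoveryNeeded as e:
        return "refusing to start: %s; please run: %s" % (
            e.reason, e.command)
    return "started (%s)" % state
```

The same exception could serve both upgrades and recovery, since the node
only needs to relay the backend's message to the operator.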

The Tahoe-LAFS Weekly News is published once a week by The Tahoe-LAFS Software
Foundation, President and Treasurer: Peter Secor |peter|. Scribes: Patrick
"marlowe" McDonald |marlowe|, Zooko Wilcox-O'Hearn, Editor Emeritus:
|zooko|.

Send your news stories to `marlowe at antagonism.org`_ - submission deadline:
Monday night.

.. _`marlowe at antagonism.org`: mailto:marlowe at antagonism.org
.. |peter| image:: psecor.jpg
   :height: 35
   :alt: peter
   :target: http://Tahoe-LAFS.org/trac/Tahoe-LAFS/wiki/AboutUs
.. |marlowe| image:: marlowe-x75-bw.jpg
   :height: 35
   :alt: marlowe
   :target: http://Tahoe-LAFS.org/trac/Tahoe-LAFS/wiki/AboutUs
.. |zooko| image:: zooko.png
   :height: 35
   :alt: zooko
   :target: http://Tahoe-LAFS.org/trac/Tahoe-LAFS/wiki/AboutUs

