devchat notes from previous meeting (04-Oct-2016)
Brian Warner
warner at lothar.com
Tue Oct 18 21:32:38 UTC 2016
(sorry I forgot to post these after the meeting, but better late than
never)
Tahoe-LAFS weekly devchat 04-Oct-2016
attendees: brian, meejah, liz, dawuud, CcxCZ (via pad-notes), zooko,
daira, leif
* ask Brian Warner what he thinks about multi-magic-folder per tahoe
client feature:
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2792
* 2490 update:
* 2490-tor: works with released txtorcon, but bug for
unix-control-port (fixed on txtorcon master)
* so with next minor release of txtorcon, branch should work for
both unix and tcp control port
* tor may have a built-in 30s delay before it publishes any
descriptors at all
* chutney? put in integration tests?
* unit-tests mock out all tor
* mocks txtorcon's actual-launch (for unit-test)
* used to have an "existing grid" that a new integration-test-ee
would connect to and download some things that "should be there"
* meejah says "probably a few hours", and if that's the price it's
probably good
* yes, put in "tox -e integration" stuff (but brian says in
different target, like "tox -e integration-tor" or something)
* note that tor's bwscanner has stuff that successfully runs Chutney
on Travis-CI.org
* make it optional, somehow (pytest.mark? something else?) -- see the sketch below
* can we "pip install" Chutney? no. must use git clone ...
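* a minimal sketch of the "pytest.mark" idea, assuming a hypothetical
"tor" marker and "--run-tor" option (names invented here, not the
actual Tahoe-LAFS integration suite):

    # conftest.py (sketch)
    import pytest

    def pytest_addoption(parser):
        parser.addoption("--run-tor", action="store_true", default=False,
                         help="run Tor/chutney integration tests")

    def pytest_configure(config):
        config.addinivalue_line(
            "markers", "tor: test needs a launched Tor / chutney network")

    def pytest_collection_modifyitems(config, items):
        if config.getoption("--run-tor"):
            return
        skip_tor = pytest.mark.skip(reason="pass --run-tor to enable")
        for item in items:
            if "tor" in item.keywords:
                item.add_marker(skip_tor)

* a "tox -e integration-tor" environment could then just add "--run-tor"
to the pytest command line, keeping plain "tox -e integration" fast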
* who should "own" the tor process (between Tahoe and Foolscap)
* "tor_provider" should own it
* it always just tells Foolscap the control-port
* (that is, never use the "launched tor" Foolscap handler)
* of course, we only ever want ONE tor launched so we need to be
careful to tell Foolscap (brian: hence the "control_port_maker" or
whatever API)
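* a minimal sketch of that split, assuming a hypothetical _TorProvider on
the Tahoe side and a Foolscap handler fed by a control-endpoint "maker"
callable (i.e. the "control_port_maker or whatever" API; names and
signatures here are illustrative, not the final Foolscap API):

    # sketch: tor_provider owns the single launched Tor; Foolscap only
    # ever sees a control endpoint, never launches Tor itself
    from twisted.internet import defer
    from twisted.internet.endpoints import clientFromString

    class _TorProvider(object):
        def __init__(self, reactor, control_port="unix:private/tor/control.socket"):
            self._reactor = reactor
            self._control_port = control_port
            self._launched = None  # Deferred: Tor gets launched at most once

        def _maybe_launch(self):
            if self._launched is None:
                # the real branch would call txtorcon here, with
                # DataDirectory=private/tor/state and the control port above
                self._launched = defer.succeed(None)
            return self._launched

        def make_control_endpoint(self):
            # handed to Foolscap's tor handler, so the "launched tor"
            # Foolscap handler is never used
            d = self._maybe_launch()
            d.addCallback(lambda _: clientFromString(self._reactor,
                                                     self._control_port))
            return d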
* brian: wants better status-updates stuff
* meejah: will release minor txtorcon for control-port fix (ASAP)
* meejah: will play around with txtorcon-1.0 vis-a-vis Foolscap + Tahoe
* needs/needed to add "control_port=" kwarg to txtorcon.launch()
* bikesheds:
* "tor files" in private/
* separate top-level things?
* all under ./tor
* data_directory to Tor should be its own thing
* so:
* private/tor/state
* private/tor/control.socket
* options: should they all start with --tor-*?
* --tor-launch is the only weird one (instead of --launch-tor)
* warner will squash/rebase 2490 and land, hopefully today or tomorrow
* meejah needs to release txtorcon-0.17.0 or whatever for ^
* cloud-backend update:
* unlikely that daira will get this landed "soon" (i.e. before 1.12)
* foolscap connection-status work: maybe land in the next few days
* 2829 (configurable magic-folder polling interval): dawuud will take
a look
* tahoe-summit: Nov8+9 in SF
* sounds like liz, zooko, david, meejah, daira, brian will come
* "mechanics institute library": possible venue
* facilitation? (e.g. "gunner")
* what are our goals? what questions do we want to answer?
* anything else for 1.12?
* adjust/config polling-interval:
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2829
* david will look at ^ for now (at least a simple config option for now)
* slackware buildslave has a failing (magic-folder) test. To do with
Unicode/"or something". #2834
* "jealous" about IPFS where it's just "out there" (i.e. *not* tied to
a specific Tahoe grid)
* scaling:
* is using more than "a few hundred" servers ever going to be
useful? (As far as uploading)
* zooko: any sort of "global grid" *needs* accounting
* accounting as stepping-stone/launchpad for global grid
* what if every computer was a Tahoe server, and we could pay them all
* also, some magic reputation thing (preventing sockpuppets/Sybils
etc)
* "who are my downloaders" -> will upload with "customers" in
mind? (i.e. "me", "my friends", "publish to the world")
* given ^, find some set of storage servers that a) meet those
needs, and b) are as cheap as possible (sketched below)
* don't need to re-compute that set of servers for many things
* zooko questions relevance of download as input to choosing whom
to upload to
* if "close servers" is most-relevant, then you might really
want close-by (geographic) servers
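* a minimal sketch of the "cheapest servers that meet those needs" idea
(the audience labels and server fields are hypothetical):

    def choose_servers(servers, audience, friends, shares_needed):
        def meets_needs(s):
            if audience == "me":
                return s.is_geographically_close  # the "close-by" case
            if audience == "my-friends":
                return s.operator in friends      # friends' machines only
            return True                           # "publish to the world"
        usable = sorted((s for s in servers if meets_needs(s)),
                        key=lambda s: s.price_per_gib_month)
        return usable[:shares_needed]

* the result could be cached, since (per the note above) it doesn't need
to be re-computed for every upload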
* leif: reputation? (yes, relevant)
* hard to have reliable reputation-system, but point here is what
*would* we do even if we had that
* ditto payment system (i.e. probably 'hard' to do, but what if we
had one that "just worked")
* brian's thesis: you'd tend to stick to "one" set of servers for
most of your own use (i.e. it wouldn't change very often?)
* 3 categories:
* 1: "won't" use any commercial services (own machines, or
friends')
* 2: users who don't care -- will use whatever, and will pay
for it (as long as service good) so e.g. S3 likely candidate
* 3: people that refuse to pay (aka "freeloaders")
* leif: 4th category -- will pay "companies" but won't pay
just one (i.e. want to distribute their data)
* brian: this is basically 2 above (i.e. but with "want
distributed" as additional want)
* accounting should serve category 1 and 2 as highest-priority
* still useful to say we have "clumps" or "clubs" of servers
* "lafs.club" as URI
* define a "storage club". someone "owns" it. each "club id"
maps to a set of servers (that's mutable over time, e.g. by owner
publishing new roster)
* when you upload/download you say what "club" it's for/in
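* a minimal sketch of such a club, assuming it is just an owner-signed,
periodically re-published roster (class and field names are hypothetical):

    class StorageClub(object):
        def __init__(self, club_id, owner_verify_key):
            self.club_id = club_id          # e.g. the "lafs.club" URI above
            self.owner_verify_key = owner_verify_key
            self.roster = frozenset()       # current set of server IDs
            self.seqnum = 0

        def update_roster(self, announcement):
            # accept a newer roster only if the owner signed it
            # (verify() is assumed to raise on a bad signature)
            self.owner_verify_key.verify(announcement.body,
                                         announcement.signature)
            if announcement.seqnum > self.seqnum:
                self.roster = frozenset(announcement.server_ids)
                self.seqnum = announcement.seqnum

    # uploads/downloads then name the club they are for, e.g.
    #   client.upload(source, club="lafs.club:...")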
* zooko: might need another layer to meet "giant scale" world
tahoe-grid (broker/search/...)
* zooko: success here has to come from setting out on the
"agoric" path without knowing the destination
* brian: mojo-nation-like download incentives need "independent"
lookup for each file (i.e. *not* the storage-club thing above)
* zooko: originally bittorrent was deployed kind-of like this
(e.g. without trackers) and then people made trackers etc.
* one extreme, "strict separation" (cap gives no location
information/hints) versus other extreme, "global grid" which
incorporates location-information along with the read-cap
* zooko arguing for something that's a little further along, but
not necessarily the "will scale globally" thing
* some func(read-cap, all-servers) -> list of useful servers
(but probably "active")
* func(read-cap, some-servers) -> list of servers (but
completely "passive")
* club thing is: func(read-cap, all-servers) -> list-of-servers
(but "mostly" passive, based on storage-club hint)
* brian thinks there's a "server selection framework" branch
* sort-of answers the "plugin" stuff
* meejah: being able to play with these different ideas easily
is useful
* "StorageFarmBroker" is probably the place to have some of this
* can IPFS scale? (leif, brian) [how does their DHT work?]
* mojo-nation: split files into hundreds or thousands of blocks
(because: fixed block size)
* tahoe: "a few dozen" shares is plenty, which greatly reduces ^
problem of contacting lots of places
* economic vs. "policy"-type reasons for using different servers
* zooko thinks they've had this conversation several times before
* different models for economic/micropayments to servers
* e.g. "i'll pay you X if i can download this share again in a month"
* basically: zooko wants incentives in the system, and "too many"
servers in the system before we try to "fix" that problem
* i.e. make it "fail from too much success", because then we'll know
more about what works and what doesn't
* how soon does it fail if we use Ethereum or Bitcoin for paying
servers?
* about 5 cents (transaction fees kill it)
* so that's about 5GiB-months on S3
* so it fails right away (i.e. can't do bitcoin/ethereum
transactions per-upload/download)
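* rough arithmetic behind that (figures as implied in the notes; real
transaction fees and S3 prices vary):

    tx_fee = 0.05       # ~5 cents per on-chain transaction
    s3_price = 0.01     # ~$/GiB-month implied by the "5 GiB-months" above
    print(tx_fee / s3_price)   # -> 5.0 GiB-months burned by a single fee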
* another model around ETH/BTC is: establish a relationship with a
server (i.e. make an account, deposit, etc -- and then the actual
upload/download per-transaction thing just deducts from that)
* level 1: the above ($/transaction), which fails out-of-the-box
* level 2: fails when you need to have relationships with more
servers than is "reasonable" per month
* so, say $1/month. So you can have relationships with ~20 servers
per month
* still using the permuted-ring thing to decide "what servers" (so
once there are more than ~40 servers they can't upload, because the
permuted ring will pick servers they won't have relationships with)
* level 3: level 2, but you inject something into the permuted-ring
so that you only contact servers you've got relationships with
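* a minimal sketch of "level 3" (all names hypothetical; the permuted-ring
function is assumed to exist elsewhere):

    MONTHLY_BUDGET = 1.00      # dollars/month available for deposits
    RELATIONSHIP_COST = 0.05   # dollars to open an account with one server
    MAX_RELATIONSHIPS = int(MONTHLY_BUDGET / RELATIONSHIP_COST)   # == 20

    def upload_candidates(storage_index, all_servers, relationships,
                          permuted_ring):
        # "inject something into the permuted ring": restrict the usual
        # permuted-ring order to servers we already have accounts with
        assert len(relationships) <= MAX_RELATIONSHIPS
        usable = [s for s in all_servers if s.server_id in relationships]
        return permuted_ring(storage_index, usable)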
* but how does a downloader find the right servers?
* alice uploads to e.g. 20/100 servers
* bob does the same thing (but has chosen a possibly-different 20
servers)
* carol downloads a readcap she got from alice: how does she find
the right servers?
* eventually she'll contact all the servers, and there's only
100 so not really a problem
* brian: but (critically) only if they answer "do you have that
share?" (yes/no) queries for free
* if queries are not-free, she'll blow her budget (either
downloading from alice, or bob)
* ...so this *might* start to break down
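* back-of-the-envelope for that "blow her budget" point (numbers
illustrative):

    total_servers = 100
    query_fee = 0.05   # cost of one "do you have this share?" probe, if not free

    # worst case carol probes every server before she has enough shares:
    print(total_servers * query_fee)   # $5.00 per download if probes cost money
    # ...versus $0.00 (just latency) if servers answer probes for free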
* so, what if we *did* deploy the above, and we got to >50 servers
and it started to break down. how to fix?
* so then maybe people deploy helper-like things that agree to
download all the shares you need
* daira: single point of failure? zooko: 20 points of failure; you
can have relationship w/ 20 servers / month
* back to storage-club-ish ideas
* talking about servers that know how to fetch "every" query (by
themselves having relationships with other servers)
* zooko doesn't want to ever advertise servers differently
* also abuse issues: server says "sure, i have that share" and then
once you put in the deposit it doesn't
* daira: contingent payments
* IPFS, MaidSafe, Sia
* how many nodes do these have?
* zooko enthusiastic about "lets get to 100" and see what starts
breaking
* brian sad that the breakage model is when "too many" people want
to help out!
* i.e. if there's 200 servers, you have to spend more per-month to
maintain server relationships
* what about: let storage servers innovate on how they answer
queries etc.
* ^ needs feedback loop where "good-er" servers get better
reputation, more money, etc
* might be many reasons people run servers
* lessons from cryptocurrency miners: power bills vs. what they get,
if they're in the red after a month or whatever, they turn it off
* (that's the only switch they have, though)
* zooko likes simple knobs: like "you can only spend X on backend"?
* friend-net ties into storage-clubs: i.e. what if you're providing
storage because your friends are providing you storage
* alice coin; alice.bob coin; alice.bob.carol coin -> support
friend-net, *but* the only API is the economic-thing
* [just brian, meejah now]
* current accounting branch labels clients with pub/priv keypairs
* ties in to leasedb
* can look up how much space each public-key uses etc
* also had a dictionary of things like "here's what i can accept"
(bitcoin, friends, etc, mystery-coin)
* url + explanation that you can show to the user (e.g. "what is
mystery-coin")
* also e.g. for "your human has to X so i can store shares for you"
* where X might be "send bitcoin to <address>"
* your friend Carol has to add your public-key to their list,
send email to <address>
* the above was "per storage server connection"
* meejah thinks we need an API that's "per share upload or
download request" to satisfy the use-cases we've talked about
* there are "anonymous" account-objects, or fURLs for
"not-anonymous" account-objects
* above stuff was mostly about "how to upgrade to a not-anonymous"
account-object or whatever
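* a minimal sketch of that per-server announcement (field names are
invented here for illustration; the real accounting branch may differ):

    server_accounting_announcement = {
        "accepts": ["bitcoin", "friend-list", "mystery-coin"],
        "explanation-url": "https://example.org/what-is-mystery-coin",
        # "what your human has to do so i can store shares for you":
        "upgrade-instructions": "send bitcoin to <address>, or ask Carol "
                                "to add your public key to her list",
    }

    # clients are labelled by a keypair, so the leasedb can answer
    # questions like "how much space does this public key lease?"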
* brian started all this stuff with "invitation codes".
"friendship bread"?