#271 closed enhancement (fixed)

implement new publish/subscribe introduction scheme

Reported by: zooko Owned by: warner
Priority: major Milestone: 0.8.0 (Allmydata 3.0 Beta)
Component: code-network Version: 0.7.0
Keywords: Cc:
Launchpad Bug:

Description (last modified by warner)

Implement the new publish/subscribe introduction scheme we've been discussing recently:

  • enumerate the services which can be published and queried for:
    • upload storage server (ones which will accept new shares)
    • download storage server (ones which will let you read shares)
      • (soon-to-be-decommissioned storage servers will be download-only)
    • helpers and other introducers may be added to this list, but we need to talk about that more first. I'm not sure about it.
  • all nodes should have an IntroducerClient, as an attribute of the Node instance.
  • to publish a service, do e.g.:
    if self.get_config("offer_storage"):
        ss = StorageServer()
        ss.setServiceParent(self)
        self.introducer.publish(ss, "upload_storage")
        self.introducer.publish(ss, "download_storage")
  • if the node cares about a particular service, it must register that intent at startup:
    if want_storage_servers:
        self.introducer.subscribe_to("upload_storage")
        self.introducer.subscribe_to("download_storage")
  • then, to access a service, there are two APIs: one that does permutation (for upload/download) and one which just returns a flat list (mostly for the welcome page):
    ppeers = self.introducer.get_permuted_peers("download_storage", storage_index)
    # ppeers is a list of (permuted_peerid, peerid, RemoteReference)
    all_peers = self.introducer.get_peers("upload_storage")
  • add config flags to disable upload, and to disable storage completely. Client installs (i.e. those created by py2exe) will disable storage service by default. Storage-only nodes won't subscribe to hear about other storage nodes.
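
The API proposed above can be sketched as a small in-memory class. This is only an illustration under stated assumptions, not the implementation that landed: the `announce` hook, the internal dictionaries, and the SHA-1-of-key-plus-peerid permutation are all hypothetical choices made for the example. Only the method names `publish`, `subscribe_to`, `get_peers`, and `get_permuted_peers` come from the description.

```python
import hashlib


class IntroducerClient:
    """Toy in-memory stand-in for the IntroducerClient described above."""

    def __init__(self):
        self._published = []      # (service_name, provider) pairs we offer
        self._subscribed = set()  # service names we want to hear about
        self._peers = {}          # service_name -> {peerid: remote reference}

    def publish(self, provider, service_name):
        # Record that this node offers `provider` under `service_name`.
        self._published.append((service_name, provider))

    def subscribe_to(self, service_name):
        # Register interest; announcements for other services are ignored.
        self._subscribed.add(service_name)
        self._peers.setdefault(service_name, {})

    def announce(self, service_name, peerid, rref):
        # Hypothetical hook: called when an announcement reaches this node.
        if service_name in self._subscribed:
            self._peers[service_name][peerid] = rref

    def get_peers(self, service_name):
        # Flat list of (peerid, reference), sorted by peerid.
        return sorted(self._peers.get(service_name, {}).items())

    def get_permuted_peers(self, service_name, key):
        # Assumed permutation: order peers by SHA-1(key + peerid).
        results = []
        for peerid, rref in self._peers.get(service_name, {}).items():
            permuted = hashlib.sha1(key + peerid).digest()
            results.append((permuted, peerid, rref))
        return sorted(results, key=lambda entry: entry[0])
```

In this sketch a node that never calls `subscribe_to("upload_storage")` accumulates no upload peers, which matches the note that storage-only nodes won't subscribe to hear about other storage nodes.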

Other things to think about:

  • get_permuted_peers could return a Deferred (which would make it easier for us to create a special kind of helper which knows about peers for you), or return an iterator, or both, somehow. To actually make this useful is non-trivial (to reduce the memory footprint, you'd want an iterator that yields Deferreds, but that might also impose a stupidly large number of roundtrips to a query). We should probably wait until we identify a need for this before implementing any part of it.
  • This API implies a publish/subscribe model in which the subscription accumulates knowledge about peers, and the actual point of use (i.e. upload or download) samples whatever peers have been acquired by that time. This might not be the best approach.

Change History (7)

comment:1 Changed at 2008-01-11T10:21:40Z by warner

In a separate but related topic, we were talking about the possible utility of different "classes" of introduction: a node could publish some object in one category ("storage servers") and a different object in some other category ("upload helpers").

It occurred to me that it might be useful to have "storage servers for upload" and "storage servers for download" to be separate categories. One use would be a way to deal with the #269 mistake (in which I accidentally caused most of our storage servers to generate new keys and therefore change nodeids). We could resurrect the old nodeids in a different place, and move all their old shares to be served by those nodes, thus making the mutable slots available once more. But we'd like those nodes to only stick around long enough to allow clients to migrate their data onto the real servers, so we'd want to prevent new shares from being uploaded to them. The only tool we have at the moment is to set size_limit=0, but sizes aren't being enforced for mutable slots yet. But, if these "read-only" nodes were published as download storage servers (and *not* upload storage servers), then the upload and download code could use slightly different peersets, and we'd get the desired behavior.

Likewise, if we have a storage server which is scheduled to be decommissioned (say, the hard drive is starting to have soft errors, and we've begun the process of migrating shares off of it but have not yet finished the job), it might be nice to allow it to be available for reading but not accept any new shares. Not publishing it as an upload server would be the most efficient way to prevent clients from trying to send shares to it.
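
To illustrate how separate categories yield different peersets for upload and download, here is a minimal sketch. The function name `split_peersets` and the representation of announcements as `(peerid, service_name)` pairs are assumptions made for the example; only the two service names come from the ticket.

```python
def split_peersets(announcements):
    """Partition announcements into upload and download peersets.

    `announcements` is an iterable of (peerid, service_name) pairs.
    """
    upload, download = set(), set()
    for peerid, service_name in announcements:
        if service_name == "upload_storage":
            upload.add(peerid)
        elif service_name == "download_storage":
            download.add(peerid)
    return upload, download


# A healthy server publishes both services; a read-only server
# (resurrected or being decommissioned) publishes only download_storage.
upload, download = split_peersets([
    ("healthy", "upload_storage"),
    ("healthy", "download_storage"),
    ("readonly", "download_storage"),
])
```

Here the read-only server appears in the download peerset but not in the upload peerset, so uploaders never consider it.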

comment:2 Changed at 2008-01-21T20:58:37Z by zooko

Rob pointed out that this generalized pubsub mechanism might be a good way to meet upload helpers.

While scrubbing the kitchen floor with Amber on Saturday, I figured out that this might be a good way to meet other introducers, leading to #68 -- "implement distributed introduction, remove Introducer as a single point of failure".

comment:3 Changed at 2008-01-23T02:50:47Z by zooko

merging in #168

comment:4 Changed at 2008-01-23T17:49:29Z by warner

  • Description modified (diff)
  • Summary changed from subscriber-only introducer client to implement new publish/subscribe introduction scheme

Updated summary and description to specify the new introduction scheme we're planning to implement.

comment:5 Changed at 2008-02-02T02:13:59Z by warner

  • Owner changed from zooko to warner
  • Status changed from new to assigned

we've finished the first step, in 7421d99f186ac96d and 3aceb6be1e797e50

  • allow clients to send a hello() to the introducer with my_furl=None to indicate that they do not wish to publish anything
  • if BASEDIR/no_storage is present, do not publish anything
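
A rough sketch of the first step as described in the two points above: decide what to pass as my_furl based on whether BASEDIR/no_storage exists. The helper name `furl_to_announce` and its arguments are hypothetical, and the actual changesets may be structured quite differently.

```python
import os


def furl_to_announce(basedir, storage_furl):
    """Return the FURL to send in hello(), or None to publish nothing."""
    # The presence of the marker file (its contents are ignored here)
    # means this node should not offer storage.
    if os.path.exists(os.path.join(basedir, "no_storage")):
        return None
    return storage_furl
```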

Rob will change the config-wizard (used by the windows installer) to touch this file at config time, and that should be enough to accomplish the primary goal: make customer nodes not offer storage servers.

The next step will be to actually split the introducer into separate publish+subscribe methods.

comment:6 Changed at 2008-02-05T19:51:57Z by warner

I've implemented the next step: splitting the introducer into separate publish and subscribe methods. The new introducer is more service-centric: you publish specific services (like "storage") rather than publishing the client as a whole.

This will cause a compatibility bump, so I haven't quite pushed it yet, but I ought to by the end of the day.

comment:7 Changed at 2008-02-06T01:23:57Z by warner

  • Resolution set to fixed
  • Status changed from assigned to closed

changes are pushed, and the test grid has been upgraded. The only remaining issue is what to do with the old introducer-ish functionality of distributing the default encoding parameters, and I'm ok with zooko's suggestion to just leave this out.

closing this ticket, yay!
