#3985 new defect

Support "storage plugins" in the Great Black Swamp storage server and client

Reported by: exarkun Owned by:
Priority: normal Milestone: undecided
Component: unknown Version: n/a
Keywords: Cc:
Launchpad Bug:

Description

allmydata.interfaces.IFoolscapStoragePlugin allows third parties to insert their own code into the network interaction between storage clients and servers.

This interface supports loading code into both storage clients and storage servers. The one existing implementation of this plugin interface does both in order to add additional parameters to some of the server's Foolscap remote methods.
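Concretely, an implementation of this interface is a named plugin object with hooks for both sides. A rough skeleton follows; the method names track the real interface, but treat the exact signatures as approximate:

```python
# Rough skeleton of an IFoolscapStoragePlugin implementation;
# signatures are approximate, not authoritative.
from zope.interface import implementer
from twisted.plugin import IPlugin
from allmydata.interfaces import IFoolscapStoragePlugin

@implementer(IPlugin, IFoolscapStoragePlugin)
class ExamplePlugin(object):
    # Matched against client/server configuration and announcements.
    name = "example"

    def get_storage_server(self, configuration, get_anonymous_storage_server):
        # Wrap the node's anonymous storage server with extra behaviour
        # and return something announceable (details elided).
        ...

    def get_storage_client(self, configuration, announcement, get_rref):
        # Return an IStorageServer provider that adds the plugin's extra
        # parameters to the Foolscap remote calls.
        ...

    def get_client_resource(self, configuration):
        # Optionally expose a web UI resource on the client node.
        ...
```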

The existing interface was intentionally made Foolscap-specific because (a) there was no other protocol supported at the time and (b) the kind of customization desired involved modifying the network protocol.

We should support the same kind of customizations in a storage server accessed using Great Black Swamp and in a storage client accessing a server using Great Black Swamp. Since the existing plugin interface is Foolscap-specific, it is likely that this will _not_ involve reusing that interface or any existing plugins for it.

Change History (2)

comment:1 Changed at 2023-03-13T13:52:21Z by exarkun

An obvious approach is to transliterate the existing plugin interface to Great Black Swamp and allow third parties to implement a new plugin for the new interface. The plugins should have comparable functionality because Great Black Swamp doesn't change much about the _semantics_ of the storage protocol, just the network layer.
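Under this approach the new interface might be little more than a transliteration with the Foolscap-specific pieces swapped for HTTP ones. A purely hypothetical sketch (none of these names exist in Tahoe-LAFS today):

```python
# Purely hypothetical: a Great Black Swamp analogue of
# IFoolscapStoragePlugin.  None of these names exist today.
from zope.interface import Attribute, Interface

class IHTTPStoragePlugin(Interface):
    """
    Like IFoolscapStoragePlugin, but for the HTTP-based Great Black
    Swamp protocol.
    """
    name = Attribute("The name used to match configuration and announcements.")

    def get_storage_server(configuration, get_anonymous_storage_server):
        """
        Return something announceable that adds plugin behaviour to the
        node's Great Black Swamp storage server (for example, extra
        HTTP resources or request validation).
        """

    def get_storage_client(configuration, announcement, get_http_client):
        """
        Return an IStorageServer provider that speaks Great Black Swamp,
        given a way to make authenticated HTTP requests instead of a
        Foolscap remote reference.
        """
```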

An entirely different approach is to leverage the _network layer_ *as* the plugin interface and have third parties provide this functionality out-of-process instead of as loadable Python modules to execute in-process.

The former, in-process approach (for clients) looks something like this (a sketch of the matching step follows the list):

  • Load server configuration
  • Load plugin code
  • Match each server to either the built-in storage client or one of the loaded plugins
  • Use the resulting object to make storage API calls on the server
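For reference, the matching step above amounts to something like the following sketch (the names and the announcement key are approximations, not the real Tahoe-LAFS API):

```python
# Illustrative sketch of plugin matching; approximate names only.
def match_storage_client(announcement, plugins, make_builtin_client):
    """
    Pick a storage client for one server: a loaded plugin whose name
    appears in the server's announced storage options, otherwise the
    built-in storage client.
    """
    by_name = {plugin.name: plugin for plugin in plugins}
    for option in announcement.get("storage-options", []):
        plugin = by_name.get(option.get("name"))
        if plugin is not None:
            # The real interface passes more context (configuration, a
            # way to get a remote reference, and so on).
            return plugin.get_storage_client(option, announcement)
    return make_builtin_client(announcement)
```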

The latter, out-of-process approach (for clients) would look something like:

  • Load server configuration
  • Match each server using the built-in storage client
  • Use the resulting object to make storage API calls to the server

*plus* a separate third-party workflow that looks something like:

  • Provision a NURL for a locally-running Great Black Swamp "plugin" process
  • Give the storage client node server configuration pointing at this process (to present the client node with N storage servers, provision N NURLs for the "plugin" process)
  • Implement plugin-specific logic in the "plugin" process (probably making the desired changes and then proxying most operations onwards to some set of real servers; see the sketch below)
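To make that concrete, here is a minimal sketch of such a "plugin" process, assuming plain HTTP for brevity (a real Great Black Swamp endpoint is TLS with certificate checks derived from the NURL, and the header shown is hypothetical):

```python
# Minimal sketch of an out-of-process "plugin": an HTTP reverse proxy
# that rewrites requests and forwards them to a real storage server.
# Plain HTTP for brevity; real GBS uses TLS with NURL-derived
# certificate verification.  The header name is hypothetical.
from twisted.internet import reactor
from twisted.web import proxy, server

class PluginProxyResource(proxy.ReverseProxyResource):
    def getChild(self, path, request):
        # Keep using this subclass for nested request paths.
        child = proxy.ReverseProxyResource.getChild(self, path, request)
        return PluginProxyResource(child.host, child.port, child.path)

    def render(self, request):
        # Plugin-specific logic goes here - for example, attaching
        # evidence that some payment or lease condition is satisfied.
        request.requestHeaders.addRawHeader(b"X-Plugin-Evidence", b"voucher")
        return proxy.ReverseProxyResource.render(self, request)

# The client node's server configuration would point a NURL at this
# local endpoint instead of at the real server.
site = server.Site(PluginProxyResource("storage.example.invalid", 8080, b""))
reactor.listenTCP(8081, site)
reactor.run()
```

A similar shape could presumably serve the server side: run the proxy in front of the real storage server and announce the proxy's NURL instead.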

Advantages of this out-of-process approach:

  • Leave the existing Tahoe-LAFS Python code alone.
    • There is substantial complexity in the current plugin implementation to deal with the quirks of the way a Tahoe-LAFS client node is represented. All of that could be avoided.
  • Plugins can be implemented in any language.
    • This might save the work of implementing each plugin more than once. The only known implementation already wants to support a Haskell client.

Disadvantages:

  • An extra process to manage.
  • Integration with dynamic server discovery (i.e., introducers) involves some more work.
    • The introducer protocol is still Foolscap (though we already knew we wanted to change that).

comment:2 Changed at 2023-03-13T14:31:11Z by exarkun

Another advantage of an out-of-process "plugin" is that Tahoe-LAFS essentially claims to have no public Python API. The plugin interface itself is effectively public, though - and actually _implementing_ a plugin requires interacting with an arbitrary number of other in-process Python APIs. An out-of-process "plugin" would have an easily specified interface that is already public - Great Black Swamp - and would not have to depend on any Python APIs.
