[tahoe-dev] [ANN] Foolscap-0.4.2 released

Brian Warner warner at lothar.com
Tue Jun 23 19:03:23 PDT 2009


I've recently released Foolscap-0.4.2 (and noticed that I hadn't sent
out a release announcement since 0.2.8 over a year ago, despite there
being 2 major and 4 minor releases in that time. my bad).

The home page is http://foolscap.lothar.com/trac . Download this release from
PyPI, or from http://foolscap.lothar.com/releases/foolscap-0.4.2.tar.gz .

This release adds the "Foolscap Application Server", an easy way to configure
and deploy various foolscap-based services like "upload-file" and
"run-command". Think of it as a twistd for foolscap. The demo directory
includes code to use a FURL to checkout and commit to a Git repository: it's
easy to set up, and the client holding this FURL gets no other access to the
server machine. The full release notes (since 0.2.8) are attached below.

Since I missed the last few releases, I'll mention some other features from
the past year. The 0.4.0 series changes the preferred access point to
"foolscap.api", begins to give up on the resource-consumption-attacks
prevention code, and fixes compatibility with python-2.6. The 0.3.0 series
adds an "Incident Gatherer" service to the logging code, and improves error
handling (DeadReferenceError and RemoteException).


Foolscap is a Twisted-friendly remote object protocol, a descendant of
Perspective Broker, with improved security properties, third-party
references, adaptable serialization, remote logging, and other useful
features. Please visit http://foolscap.lothar.com/trac for more details.


have a capable day,
 -Brian




* Release 0.4.2 (16 Jun 2009)

** Compatibility

Same as 0.4.1

** the Foolscap Application Server

The big new feature in this release is the "Foolscap Application Server".
This is both a demo of what you can do with Foolscap, and an easy way to
deploy a few simple services that run over secure connections. You create and
start a "flappserver" on one machine, and then use the new "flappclient" on
the other side. The server can contain multiple services, each with a
separate FURL. You give the client a FURL for a specific service, it
connects, does a job, and shuts down.

See doc/flappserver.xhtml for details.

Two service types are provided in this release. The first is a simple
file-uploader: the holder of the FURL gets to upload arbitrary files into a
specific target directory, and nowhere else. The second is a pre-configured
command runner: the service is configured with a shell command, and the
client gets to make it run (but doesn't get to influence anything about what
gets run). The run-command service defaults to sending stdout/stderr/exitcode
to the client program, which will behave as if it were the command being run
(stdout and stderr appear at right time, and it exits with the same
exitcode). The service can be configured to accept stdin, or to turn off
stdout or stderr. The service always runs in a preconfigured working
directory.

To do this with SSH, you'd need to create a new keypair, then set up an
authorized_keys entry to limit that pubkey to a single command, and hope that
environment variables and the working directory don't cause any surprises.
Implementing the fixed-directory file-uploader would probably require a
specialized helper program.

The flappserver provides an easy-to-configure capability-based replacement
those sorts of SSH setups. The first use-case is to allow buildslaves to
upload newly-created debian packages to a central repository and then trigger
a package-index rebuild script. By using FURLs instead of raw SSH keys, the
buildslaves will be unable to affect any .debs in other directories, or any
other files on the repository host, nor will they be able to run arbitrary
commands on that host. By storing the FURLs in a file and using the
--furlfile argument to "flappclient", a buildbot transcript of the upload
step will not leak the upload authority.

** new RemoteReference APIs

RemoteReference now features two new methods. rref.isConnected() returns a
boolean, True if the remote connection is currently live, False if it has
been lost. This is an immediate form of the rref.notifyOnDisconnect()
callback-registration mechanism, and can make certain types of
publish-subscribe code easier to write.


The second is rref.getLocationHints(), which returns a list of location hints
as advertised by the host Tub. Most hints are a ("ipv4",host,portnum) tuple,
but other types may be defined in the future. Note that this is derived from
the FURL that each Tub sends with its my-reference sequence (i.e. it is
entirely controlled by the Tub in which that Referenceable lives), so
getLocationHints() is quite distinct from rref.getPeer() (which returns an
IPv4Address or LoopbackAddress instance describing the other end of the
actual network connection). getLocationHints() indicates what the other Tub
wants you to use for new connections, getPeer() indicates what was used for
the existing connection (which might not accept new connections due to NAT or
proxy issues).

getLocationHints() is meant to make it easier to write connection-status
display code, for example in a server which holds connections to a number of
peers. A status web page can loop over the peer RemoteReferences and display
location information for each one without needing to look deep inside the
hidden RemoteReferenceTracker instance to find it.

** giving up on resource-consumtion defenses

Ticket #127 contains more detail, but beginning with this release, Foolscap
will be slowly removing the code that attempted to prevent memory-exhaustion
attacks. Doing this in a single process is just too hard, and the limits that
were enforced provided more problems than protection. To this end, an
internal 200-byte limit on FURL length (applied in Gifts) has been removed.
Later releases will remove more code, hopefully simplifying the deserization
path.

** other bugfixes

Previous releases would throw an immediate exception when Tub.getReference()
or Tub.connectTo() was called with an unreachable FURL (one with a corrupted
or empty set of location hints). In code which walks a list of FURLs and
tries to initiate connections to all of them, this synchronous exception
would bypass all FURLs beyond the troublesome one.

This has been improved: Tub.getReference() now always returns a Deferred,
even if the connection is doomed to fail because of a bad FURL. These
problems are now indicated by a Deferred that errbacks instead of a
synchronous exception.


* Release 0.4.1 (22 May 2009)

** Compatibility

Same as 0.4.0

** Bug fixes

The new RemoteException class was not stringifiable under python2.4 (i.e.
str(RemoteException(f)) would raise an AttributeError), causing problems
especially when callRemote errbacks attempted to record the received Failure
with log.msg(failure=f). This has been fixed.


* Release 0.4.0 (19 May 2009)

** Compatibility

The wire protocol remains the same as before, unchanged since 0.2.6 .

The main API entry point has moved to "foolscap.api": e.g. you should do
"from foolscap.api import Tub" instead of "from foolscap import Tub".
Importing symbols directly from the "foolscap" module is now deprecated.
(this makes it easier to reorganize the internal structure of Foolscap
without causing circular dependencies). (#122)

A near-future release (probably 0.4.1) will add proper
DeprecationWarnings-raising wrappers to all classes and functions in
foolscap/__init__.py . The next major release (probably 0.5.0) will remove
these symbols from foolscap/__init__.py altogether.

Logging functions are still meant to be imported from foolscap.logging.* .

** expose-remote-exception-types (#105)

Remote exception reporting is changing. Please see the new document
docs/failures.xhtml for full details. This release adds a new option:

 tub.setOption("expose-remote-exception-types", BOOL)

The default is True, which provides the same behavior as previous releases:
remote exceptions are presented to look as much as possible like local
exceptions.

If you set it to False, then all remote exceptions will be collapsed into a
single "foolscap.api.RemoteException" type, with an attribute named .failure
that can be used to get more details about the remote exception. This means
that callRemote will either fire its Deferred with a regular value, or
errback with one of three exception types from foolscap.api:
DeadReferenceError, Violation, or RemoteException. (When the option is True,
it could errback with any exception type, limited only by what the remote
side chose to raise)

A future version of Foolscap may change the default value of this option.
We're not sure yet: we need more experience to see which mode is safer and
easier to code with. If the default is changed, the deprecation sequence will
probably be:

 0.5.0: require expose-remote-exception-types to be set
 0.6.0: change the default to False, stop requiring the option to be set
 0.7.0: remove the option

** major bugs fixed:

Shared references now work after a Violation (#104)

The tubid returned by rref.getSturdyRef() is now reliable (#84)

Foolscap should work with python-2.6: Decimal usage fixed, sha/md5
deprecation warnings fixed, import of 'sets' still causes a warning. (#118,
#121)

Foolscap finally uses new-style classes everywhere (#96)

bin/flogtool might work better on windows now (#108)

logfiles now store library versions and process IDs (#80, #97)

The "flogtool web-viewer" tool listens at a URL of "/" instead of "/welcome",
making it slightly easier to use (#120)

You can now setOption() on both log-gatherer-furl and log-gatherer-furlfile
on the same Tub. Previously this caused an error. (#114)


* Release 0.3.2 (14 Oct 2008)

** Compatibility: same as 0.2.6

Incident classifier functions (introduced in 0.3.0) have been changed: if you
have written custom functions for an Incident Gatherer, you will need to
modify them upon upgrading to this release.

** Logging Changes

The log.msg counter now uses a regular Python integer/bigint. The counter in
0.3.1 used itertools.count(), which, despite its documentation, stores the
counter in a C signed int, and thus throws an exception when the message
number exceeds 2**31-1 . This exception would pretty much kill any program
which ran long enough to emit this many messages, a situation which was
observed in a busy production server with an uptime of about three or four
weeks. The 0.3.2 counter will be promoted to a bigint when necessary,
removing this limitation. (ticket #99)

The Incident-Gatherer now imports classification functions from files named
'classify_*.py' in the gatherer's directory. This effectively adds
"classifier plugins". The signature of the functions has changed since the
0.3.0 release, making them easier to use. If you have written custom
functions (and edited the gatherer.tac file to activate them, using
gs.add_classifier()), you will need to modify the functions to take a single
'trigger' argument.

These same 'classify_*.py' plugins are used by a new "flogtool
classify-incident" subcommand, which can be pointed at an incident file, and
performs the same kind of classification as the Incident Gatherer. (#102).

The logfiles produced by the "flogtool tail" command and the internal
incident-reporter now include the PID of the reporting process. This can be
seen by passing the --verbose option to "flogtool dump", and will be made
more visible in later releases. (#80).

The "flogtool web-viewer" tool now marks Incident triggers (#79), and
features a "Reload Logfile" button to re-read the logfile on disk (#103).
This is most useful when running unit tests, in conjunction with the
FLOGFILE= environment variable.

** Other Changes

When running unit tests, if the #62 bug is encountered (pyopenssl >= 0.7 and
twisted <= 8.1.0 and selectreactor), the test process will emit a warning and
pause for ten seconds to give the operator a chance to halt the test and
re-run it with --reactor=poll. This may help to reduce the confusion of a
hanging+failing test run.

The xfer-client.py example tool (in doc/listings/) has been made more
useable, by calling os.path.expanduser() on its input files, and by doing
sys.exit(1) on failure (instead of hanging), so that external programs can
react appropriately.


* Release 0.3.1 (03 Sep 2008)

** Compatibility: same as 0.2.6

** callRemote API changes: DeadReferenceError

All partitioning exceptions are now mapped to DeadReferenceError. Previously
there were three separate exceptions that might indicate a network partition:
DeadReferenceError, ConnectionLost, and ConnectionDone. (a network partition
is when one party cannot reach the other party, due to a variety of reasons:
temporary network failure, the remote program being shut down, the remote
host being taken offline, etc).

This means that, if you want to send a message and don't care whether that
message makes it to the recipient or not (but you *do* still care if the
recipient raises an exception during processing of that message), you can set
up the Deferred chain like this:

 d = rref.callRemote("message", args)
 d.addCallback(self.handle_response)
 d.addErrback(lambda f: f.trap(foolscap.DeadReferenceError))
 d.addErrback(log.err)

The first d.addErrback will use f.trap to catch DeadReferenceError, but will
pass other exceptions through to the log.err() errback. This will cause
DeadReferenceError to be ignored, but other errors to be logged.

DeadReferenceError will be signalled in any of the following situations:

 1: the TCP connection was lost before callRemote was invoked
 2: the connection was lost after the request was sent, but before
    the response was received
 3: when the active connection is dropped because a duplicate connection was
    established. This can occur when two programs are simultaneously
    connecting to each other.

** logging improvements

*** bridge foolscap logs into twistd.log

By calling foolscap.logging.log.bridgeLogsToTwisted(), or by setting the
$FLOGTOTWISTED environment variable (to anything), a subset of Foolscap log
events will be copied into the Twisted logging system. The default filter
will not copy events below the log.OPERATIONAL level, nor will it copy
internal foolscap events (i.e. those with a facility name that starts with
"foolscap"). This mechanism is careful to avoid loops, so it is safe to use
both bridgeLogsToTwisted() and bridgeTwistedLogs() at the same time. The
events that are copied into the Twisted logging system will typically show up
in the twistd.log file (for applications that are run under twistd).

An alternate filter function can be passed to bridgeLogsToTwisted().

This feature provides a human-readable on-disk record of significant events,
using a traditional one-line-per-event all-text sequential logging structure.
It does not record parent/child relationships, structured event data, or
causality information.

*** Incident Gatherer improvements

If an Incident occurs while a previous Incident is still being recorded (i.e.
during the "trailing log period"), the two will be folded together.
Specifically, the second event will not trigger a new Incident, but will be
recorded in the first Incident as a normal log event. This serves to address
some performance problems we've seen when incident triggers occur in
clusters, which used to cause dozens of simultaneous Incident Recorders to
swing into action.

The Incident Gatherer has been changed to only fetch one Incident at a time
(per publishing application), to avoid overloading the app with a large
outbound TCP queue.

The Incident Gatherer has also been changed to scan the classified/* output
files and reclassify any stored incidents it has that are not mentioned in
one of these files. This means that you can update the classification
functions (to add a function for some previously unknown type of incident),
delete the classified/unknown file, then restart the incident gatherer, and
it will only reclassify the previously-unknown incidents. This makes it much
easier to iteratively develop classification functions.

*** Application Version data

The table of application versions, previously displayed only by the 'flogtool
tail' command, is now recorded in the header of both Incidents and the
'flogtool tail --save-to' output file.

The API to add application versions has changed: now programs should call
foolscap.logging.app_versions.add_version(name, verstr).


* Release 0.3.0 (04 Aug 2008)

** Compatibility: same as 0.2.6

The wire-level protocol remains the same as other recent releases.

The new incident-gatherer will only work with applications that use Foolscap
0.3.0 or later.

** logging improvements

The "incident gatherer" has finally been implemented. This is a service, like
the log-gatherer, which subscribes to an application's logport and collects
incident reports: each is a dump of accumulated log messages, triggered by
some special event (such as those above a certain severity threshold). The
"flogtool create-incident-gatherer" command is used to create one, and twistd
is used to start it. Please see doc/logging.xhtml for more details.

The incident publishing API was changed to support the incident-gatherer. The
incident-gatherer will only work with logports using foolscap 0.3.0 or newer.

The log-publishing API was changed slightly, to encourage the use of
subscription.unsubscribe() rather than publisher.unsubscribe(subscription).
The old API remains in place for backwards compatibility with log-gatherers
running foolscap 0.2.9 or earlier.

The Tub.setOption("log-gatherer-furlfile") can accept a file with multiple
FURLs, one per line, instead of just a single FURL. This makes the
application contact multiple log gatherers, offering its logport to each
independently, e.g. to connect to both a log-gatherer and an
incident-gatherer.

** API Additions

RemoteReferences now have a getRemoteTubID() method, which returns a string
(base32-encoded) representing the secure Tub ID of the remote end of the
connection. For any given Tub ID, only the possessor of the matching private
key should be able to provide a RemoteReference for which getRemoteTubID()
will return that value. I'm not yet sure if getRemoteTubID() is a good idea
or not (the traditional object-capability model discourages making
access-control decisions on the basis of "who", instead these decisions
should be controlled by "what": what objects do they have access to). This
method is intended for use by application code that needs to use TubIDs as an
index into a table of some sort. It is used by Tahoe to securely compute
shared cryptographic secrets for each remote server (by hashing the TubID
together with some other string).

Note that the rref.getSturdyRef() call (which has been present in Foolscap
since forever) is *not* secure: the remote application controls all parts of
the sturdy ref FURL, including the tubid. A future version of foolscap may
remedy this.

** Bug fixes

The log-gatherer FURL can now be set before Tub.setLocation (the connection
request will be deferred until setLocation is called), and
getLogPort/getLogPortFURL cannot be called until after setLocation. These two
changes, in combination, resolve a problem (#55) in which the gatherer
connection could be made before the logport was ready, causing the
log-gatherer to fail to subscribe to receive log events.

** Dependent Libraries

Foolscap uses PyOpenSSL for all of its cryptographic routines. A bug (#62)
has been found in which the current version of Twisted (8.1.0) and the
current version of PyOpenSSL (0.7) interact badly, causing Foolscap's unit
tests to fail. This problem will affect application code as well
(specifically, Tub.stopService will hang forever). The problem only appears
to affect the selectreactor, so the current recommended workaround is to run
unit tests (and applications that need to shut down Tubs) with --reactor=poll
(or whatever other reactor is appropriate for the platform, perhaps iocp). A
less-desireable workaround is to downgrade PyOpenSSL to 0.6, or Twisted to
something older. The Twisted maintainers are aware of the problem and intend
to fix it in an upcoming Twisted release.


* Release 0.2.9 (02 Jul 2008)

** Compatibility: exactly the same as 0.2.6

** logging bugs fixed

The foolscap.logging.log.setLogDir() option would throw an exception if the
directory already existed, making it unsuitable for use in an application
which is expected to be run multiple times. This has been fixed.

** logging improvements

'flogtool tail' now displays the process ID and version information about the
remote process. The tool will tolerate older versions of foolscap which do
not offer the get_pid interface. (foolscap ticket #71)

The remote logport now uses a size-limited queue for messages going to a
gatherer or 'flogtool tail', to prevent the monitored process from using
unbounded amounts of memory during overload situations (when it is generating
messages faster than the receiver can handle them). This solves a runaway
load problem we've seen in Tahoe production systems, in which a busy node
sends log messages to a gatherer too quickly for it to absorb, using lots of
memory to hold the pending messages, which causes swapping, which causes more
load, making the problem worse. We frequently see an otherwise well-behaved
process swell to 1.4GB due to this problem, occasionally failing due to VM
exhaustion. Of course, a bounded queue means that new log events will be
dropped during this overload situation. (#72)

** serialization added for the Decimal type (#50)

** debian packaging targets added for gutsy and hardy

The Makefile now has 'make debian-gutsy' and 'make debian-hardy' targets.
These do the same thing as 'make debian-feisty'. (#76)


More information about the tahoe-dev mailing list