#2373 new enhancement

adopt pyrsistent

Reported by: zooko Owned by:
Priority: normal Milestone: undecided
Component: unknown Version: 1.10.0
Keywords: Cc:
Launchpad Bug:

Description

https://warehouse.python.org/project/pyrsistent/

Immutable data structures. They reduce bugs by avoiding the situation where someone mutates a data structure that someone else was counting on to stay the same. We already try to use frozenset instead of set or dict, where-ever we can, but the pyrsistent library also adds immutable dicts and many other useful tools. I still remember staying up all night with Daira one time tracking down a bug just before the Tahoe-LAFS v1.8 release, which turned out to be an accidentally-mutated dict…

One huge drawback to adopting this, of course, is that it is another dependency. However, it is pure-Python and I don't think they're going to be making backwards-incompatible changes to the API, so it is a lot less costly than some of our dependencies.

Change History (2)

comment:1 Changed at 2015-02-05T17:11:12Z by daira

+1 for the basic idea: I like persistent data structures and they're essentially as expressive as mutable ones, while being less error-prone.

+1 for pyrsistent's API which appears, on a cursory reading, to be well-thought-out and reasonably complete. I particularly like the PRecord abstraction which we could make good use of in many places.

+0.3 for pyrsistent's unit test suite. This is not a full +1 because the tests, although they cover the external interfaces reasonably well, are only "black-box" tests that do not attempt to check the internal representations of the persistent structures or to exercise any hard cases. In particular, they typically only test structures with a small number of elements, which does not properly cover all of the implementation, because most of the structures are based on trees with an arity of 32.

-1 because pyrsistent has a C extension which is optionally used. This potentially introduces build problems if the build process has a problem compiling the extension and fails to correctly fall back to the pure-Python code.

Note that I don't think we use frozenset and tuples in all the places we could. Where possible, it is better to use a truly immutable data structure rather than a persistent data structure, both for efficiency and understandability.

comment:2 Changed at 2020-01-20T20:50:54Z by exarkun

pyrsistent is quite slow without the C extensions. Even with them, libraries like Eliot have observed that a lot of time ends up spent in pyrsistent implementation (I have an idea that for Eliot, native CPython types were found to be faster but I can't produce evidence so don't assign this much weight).

The big downsides discussed above have been neutralized at this point as Tahoe-LAFS now already has pyrsistent as a transient dependency. I haven't noticed any particular problems arising from this. I think that binary wheels are now supported well-enough by the relevant parts of the install toolchain that simple extension modules like this are not really a big deal any more (on the other hand, it could mean that few newer users are adopting Tahoe-LAFS these days and everyone else has long since figured out how to keep it working).

Anyway, I think this should probably be done. That said, someone has to actually rewrite some part of Tahoe-LAFS to make use of pyrsistent instead of some other data structures...

Note: See TracTickets for help on using tickets.