[tahoe-lafs-trac-stream] [Tahoe-LAFS] #1290: replace all use of pickles with JSON

Tahoe-LAFS trac at tahoe-lafs.org
Thu Apr 28 07:52:59 UTC 2016


#1290: replace all use of pickles with JSON
----------------------------+----------------------------------
     Reporter:  davidsarah  |      Owner:  somebody
         Type:  defect      |     Status:  new
     Priority:  major       |  Milestone:  undecided
    Component:  code        |    Version:  1.8.1
   Resolution:              |   Keywords:  security pickle json
Launchpad Bug:              |
----------------------------+----------------------------------

Old description:

> The [http://docs.python.org/library/pickle.html pickle] format is
> specific to Python. Loading pickles allows arbitrary code execution (by
> design) and has been subject to
> [http://scarybeastsecurity.blogspot.com/2008/10/some-python-bugs.html
> memory corruption bugs].
>
> The security exposure in Tahoe-LAFS is in practice not too bad because we
> only use pickles as private state, and it could be argued that a storage
> server has security problems anyway if an attacker can write to the
> filesystem under its node directory. Still, the potential for memory
> corruption is not nice.
>
> We currently read and write pickles:
>  * in {{{PickleStatsGatherer}}} at [source:src/allmydata/stats.py#L245]
>  * in {{{ShareCrawler}}} in [source:src/allmydata/storage/crawler.py]
>  * in {{{LeaseCheckingCrawler}}} (subclass of {{{ShareCrawler}}}) in
> [source:src/allmydata/storage/expirer.py]
>  * in [source:misc/operations_helpers/cpu-watcher.tac]
>
> If all of these uses of pickles were simply replaced with JSON, the state
> of crawls in progress at the time of the upgrade would be lost. This
> seems acceptable to me; I don't see any need to support resuming an
> interrupted crawl from a pickle written by a previous version.
>
> See also #1280 and #561.

New description:

 The [http://docs.python.org/library/pickle.html pickle] format is specific
 to Python. Loading pickles allows arbitrary code execution (by design) and
 has been subject to
 [http://scarybeastsecurity.blogspot.com/2008/10/some-python-bugs.html
 memory corruption bugs].
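
 As a rough illustration of the "arbitrary code execution (by design)" point,
 here is a minimal sketch. The `Payload` class is hypothetical and not taken
 from the Tahoe-LAFS source; it uses a harmless `eval` of an arithmetic
 expression to show that unpickling invokes whatever callable the pickle
 names, while a JSON parser rejects the same bytes.

```python
import json
import pickle


class Payload(object):
    # Hypothetical class for illustration only. __reduce__ tells pickle
    # which callable to invoke, and with which arguments, at load time.
    def __reduce__(self):
        return (eval, ("6 * 7",))


data = pickle.dumps(Payload())

# Unpickling calls eval("6 * 7"); no Payload instance is reconstructed.
result = pickle.loads(data)

# A JSON parser only builds plain data (dicts, lists, strings, numbers),
# so the same bytes are rejected as malformed input.
try:
    json.loads(data.decode("latin-1"))
    json_rejected = False
except ValueError:
    json_rejected = True
```

 Here `result` is the value returned by the callable rather than an object of
 the pickled class, which is why a pickle file that an attacker can write is
 equivalent to code that the attacker can run.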

 The security exposure in Tahoe-LAFS is in practice not too bad because we
 only use pickles as private state, and it could be argued that a storage
 server has security problems anyway if an attacker can write to the
 filesystem under its node directory. Still, the potential for memory
 corruption is not nice.

 We currently read and write pickles:
  * ~~in {{{PickleStatsGatherer}}} at
 [source:src/allmydata/stats.py#L245]~~
  * in {{{ShareCrawler}}} in [source:src/allmydata/storage/crawler.py]
  * in {{{LeaseCheckingCrawler}}} (subclass of {{{ShareCrawler}}}) in
 [source:src/allmydata/storage/expirer.py]
  * in [source:misc/operations_helpers/cpu-watcher.tac]

 If all of these uses of pickles were simply replaced with JSON, the state
 of crawls in progress at the time of the upgrade would be lost. This seems
 acceptable to me; I don't see any need to support resuming an interrupted
 crawl from a pickle written by a previous version.
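
 A minimal sketch of what JSON-based crawler state could look like, assuming
 the state is a plain dict of JSON-compatible values. The function names and
 state fields below are hypothetical and are not the actual `ShareCrawler`
 code. Anything that does not parse as a JSON object, including a pickle left
 behind by a previous version, is treated as "no saved state", which matches
 the proposal to let an in-progress crawl start over after the upgrade.

```python
import json
import os
import tempfile


def fresh_state():
    # Hypothetical starting state; the real crawler's fields may differ.
    return {"version": 1, "last-complete-prefix": None, "current-cycle": None}


def save_state(path, state):
    # Write to a temporary file in the same directory and rename it into
    # place, so a crash cannot leave a half-written state file behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def load_state(path):
    # A missing, unreadable, or non-JSON file (for example an old pickle)
    # means the crawl simply starts again from a fresh state.
    try:
        with open(path, "r", encoding="utf-8") as f:
            state = json.load(f)
    except (OSError, ValueError):
        return fresh_state()
    if not isinstance(state, dict):
        return fresh_state()
    return state
```

 Note that the old pickle is never passed to `pickle.load` here, so the
 upgrade path does not itself depend on the format being retired.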

 See also #1280 and #561.

--

Comment (by warner):

 `PickleStatsGatherer` is gone, replaced by `JSONStatsGatherer`, in c9047b1
 (which is after tahoe-lafs-1.11.0, and should be in 1.12.0).

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1290#comment:5>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage

