[tahoe-lafs-trac-stream] [Tahoe-LAFS] #3672: UnicodeDecodeError in Eliot messages

Tahoe-LAFS trac at tahoe-lafs.org
Wed Apr 21 13:58:02 UTC 2021


#3672: UnicodeDecodeError in Eliot messages
--------------------------+------------------------------
     Reporter:  itamarst  |      Owner:
         Type:  defect    |     Status:  new
     Priority:  normal    |  Milestone:  Support Python 3
    Component:  unknown   |    Version:  n/a
   Resolution:            |   Keywords:
Launchpad Bug:            |
--------------------------+------------------------------

Comment (by itamarst):

 More context:

 1. Eliot was originally developed on Python 2, where bytestrings were the
 norm.
 2. JSON doesn't know about bytes.

 For JSON serialization, Eliot therefore followed Python 2's lead: if
 bytes looked like a UTF-8-encoded unicode string, they were serialized as
 a JSON string.

 With Python 3, bytestrings are no longer the default, which means bytes
 are more likely to be ... bytes. On Python 3, Eliot therefore decided not
 to handle bytes by default in log messages, since it's not clear what the
 correct thing to do is. How to handle them is left up to individual
 applications.
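
 To illustrate why a policy is needed at all, here is a minimal sketch
 using only the standard-library json module on Python 3 (not Eliot
 itself). The helper name is_json_serializable is made up for this
 example.

```python
import json


def is_json_serializable(value):
    """Return True if the stdlib JSON encoder accepts `value` as-is.

    Hypothetical helper for illustration only; not part of Eliot or
    Tahoe-LAFS.
    """
    try:
        json.dumps(value)
    except TypeError:
        # On Python 3 the default encoder rejects bytes objects.
        return False
    return True


print(is_json_serializable({"key": "text"}))  # unicode string value
print(is_json_serializable({"key": b"raw"}))  # bytes value
```

 Without an explicit rule for bytes, the second call fails to serialize,
 which is the gap each application has to fill.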

 As a result, Tahoe-LAFS on Python 3 needs a policy decision on how to
 handle byte serialization. The initial policy decision was "handle bytes
 that look like UTF-8-encoded unicode strings".

 However, it turns out Tahoe actually logs random byte strings, some of
 which are very much not UTF-8 decodable. This PR allows Tahoe to continue
 doing so by using hex quoting when necessary.
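
 The fallback described above can be sketched roughly as follows. This
 is not the code from the PR: the function names and the "hex:" prefix
 are assumptions made for illustration, and the real quoting format may
 differ. It plugs into the standard-library json module through the
 default= hook.

```python
import json


def bytes_to_text(value):
    """Decode bytes as UTF-8 if possible, else fall back to hex.

    Hypothetical helper; the "hex:" marker is an assumed format, not
    necessarily what Tahoe-LAFS emits.
    """
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        return "hex:" + value.hex()


def json_default(obj):
    """default= hook: called by json for objects it cannot serialize."""
    if isinstance(obj, bytes):
        return bytes_to_text(obj)
    raise TypeError("not JSON serializable: %r" % (type(obj),))


def to_json(message):
    """Serialize a log message dict, tolerating bytes values."""
    return json.dumps(message, default=json_default)


print(to_json({"name": b"hello"}))
print(to_json({"storage_index": b"\xff\x00"}))
```

 Note that the default= hook only covers values; bytes used as
 dictionary keys would need separate handling.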

--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3672#comment:1>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage

