#3854 closed defect (fixed)

builtins.TypeError: write() argument must be str, not bytes from allmydata/webish.py usage of FileUploadFieldStorage

Reported by: exarkun Owned by: itamarst
Priority: normal Milestone: undecided
Component: unknown Version: n/a
Keywords: python3 Cc:
Launchpad Bug:

Description

On Python 3.9 when issuing some request to the Tahoe-LAFS web API, this traceback comes up:

 2022-01-04T10:36:04-0500 [_GenericHTTPChannelProtocol,1,127.0.0.1] Unhandled Error
        Traceback (most recent call last):
          File "python3.9/site-packages/twisted/python/log.py", line 103, in callWithLogger
            return callWithContext({"system": lp}, func, *args, **kw)
          File "python3.9/site-packages/twisted/python/log.py", line 86, in callWithContext
            return context.call({ILogContext: newCtx}, func, *args, **kw)
          File "python3.9/site-packages/twisted/python/context.py", line 122, in callWithContext
            return self.currentContext().callWithContext(ctx, func, *args, **kw)
          File "python3.9/site-packages/twisted/python/context.py", line 85, in callWithContext
            return func(*args,**kw)
        --- <exception caught here> ---
          File "python3.9/site-packages/twisted/internet/posixbase.py", line 614, in _doReadOrWrite
            why = selectable.doRead()
          File "python3.9/site-packages/twisted/internet/tcp.py", line 243, in doRead
            return self._dataReceived(data)
          File "python3.9/site-packages/twisted/internet/tcp.py", line 249, in _dataReceived
            rval = self.protocol.dataReceived(data)
          File "python3.9/site-packages/twisted/web/http.py", line 3024, in dataReceived
            return self._channel.dataReceived(data)
          File "python3.9/site-packages/twisted/web/http.py", line 2305, in dataReceived
            return basic.LineReceiver.dataReceived(self, data)
          File "python3.9/site-packages/twisted/protocols/basic.py", line 579, in dataReceived
            why = self.rawDataReceived(data)
          File "python3.9/site-packages/twisted/web/http.py", line 2312, in rawDataReceived
            self._transferDecoder.dataReceived(data)
          File "python3.9/site-packages/twisted/web/http.py", line 1755, in dataReceived
            finishCallback(data[contentLength:])
          File "python3.9/site-packages/twisted/web/http.py", line 2171, in _finishRequestBody
            self.allContentReceived()
          File "python3.9/site-packages/twisted/web/http.py", line 2284, in allContentReceived
            req.requestReceived(command, path, version)
          File "python3.9/site-packages/allmydata/webish.py", line 134, in requestReceived
            self.fields = FileUploadFieldStorage(
          File "python3.9/cgi.py", line 482, in __init__
            self.read_single()
          File "python3.9/cgi.py", line 675, in read_single
            self.read_binary()
          File "python3.9/cgi.py", line 697, in read_binary
            self.file.write(data)
        builtins.TypeError: write() argument must be str, not bytes

I'm not exactly sure yet what request triggers this.

Change History (9)

comment:1 Changed at 2022-01-04T16:55:00Z by itamarst

  • Owner set to itamarst

comment:2 Changed at 2022-01-04T16:59:30Z by itamarst

I wonder if this has to do with the hack in FileUploadFieldStorage.

comment:3 Changed at 2022-01-04T17:04:01Z by itamarst

Ah, so:

In order to work around problems with how Python 3 decides whether something is bytes or unicode, we implement a heuristic that says "if the 'name' MIME field of the upload was 'file', assume it's bytes."

And my guess is that heuristic isn't good enough for all clients, so we need some more aggressive heuristic. I guess we could just say "it's always bytes"? But that might break some other bits, like the web UI.

comment:4 Changed at 2022-01-04T17:06:46Z by itamarst

One problem with the current heuristic is that it breaks Python 3's "'filename' field was set" heuristic. So that's one change to make.

And then clients could be required to set that going forward.
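
To make the failure mode concrete, here is a minimal sketch of how a bytes-or-text heuristic like the ones discussed above can produce the TypeError from the traceback. It is not the code from allmydata/webish.py or from cgi.py: `open_upload_file` and `store_body` are hypothetical names, and the combined rule (binary if a filename is present or the field is named "file") is an assumption made for illustration.

```python
import tempfile


def open_upload_file(field_name, filename=None):
    """Hypothetical stand-in for the file-creation step of a form parser."""
    # Assumed heuristic: treat the part as binary if it carries a filename
    # or if the field is named "file"; otherwise fall back to text mode.
    if filename is not None or field_name == "file":
        return tempfile.TemporaryFile("wb+")
    return tempfile.TemporaryFile("w+", encoding="utf-8", newline="\n")


def store_body(field_name, body, filename=None):
    """Write a bytes body to the chosen file and report whether it worked."""
    with open_upload_file(field_name, filename) as f:
        try:
            f.write(body)
        except TypeError as e:
            # A text-mode file rejects bytes, as in the traceback above.
            return "TypeError: {}".format(e)
        return "ok"


print(store_body("file", b"{}"))                      # binary branch
print(store_body("other", b"{}", filename="x.json"))  # binary branch
print(store_body("other", b"{}"))                     # text branch, fails
```

Any body that reaches the text-mode branch fails as soon as bytes are written, which is why a request matching neither rule can blow up.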

comment:5 Changed at 2022-01-04T17:07:15Z by itamarst

Anyway I would look for uploads in the client code that is triggering this to get a reproducer.

comment:6 Changed at 2022-01-06T13:53:05Z by exarkun

The request that triggers the traceback is a POST to /storage-plugins/privatestorageio-zkapauthz-v1/calculate-price (so, not a first-party resource).

The headers are:

{ 'content-length': '2433'
, 'authorization': 'tahoe-lafs O_0Cs...'
, 'content-type': 'application/json'
, 'accept-encoding': 'gzip'
, 'host': '127.0.0.1:39053'
}

There are some first-party resources that accept JSON so maybe it is possible to reproduce this without involving third-party plugins. Although I don't know what it means that there are not already any failing unit tests for this code path (apart from the obvious guess of incomplete test coverage).

Looking at docs/frontends/webapi.rst I see POST /uri?t=mkdir-with-children which should be pretty similar (there are also a lot of variations on that action, "create a directory", that also take JSON bodies).

The docs *don't* say that a content-type: application/json header is required in these cases. I don't know if that's a relevant distinction or not.
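
As a starting point for a reproducer against a first-party resource, a request along these lines might work. This is only a sketch: the helper name `build_mkdir_request` is hypothetical, the empty children dictionary is an assumed minimal body, the base URL just reuses the host header shown above, and the authorization header is left out because its value is elided in this ticket. Whether this request actually triggers the traceback is untested.

```python
import json
import urllib.request


def build_mkdir_request(base_url, children=None):
    """Build (but do not send) a JSON POST to /uri?t=mkdir-with-children."""
    # Assumption: an empty JSON object is an acceptable "no children" body.
    body = json.dumps(children or {}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/uri?t=mkdir-with-children",
        data=body,
        # The explicit JSON content type mirrors the failing plugin request.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_mkdir_request("http://127.0.0.1:39053")
# To send it to a running node: urllib.request.urlopen(req)
```

Sending the same request with and without the content-type header would show whether that header is the relevant distinction.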

comment:7 Changed at 2022-01-06T13:54:07Z by exarkun

  • Keywords python3 added

comment:8 Changed at 2022-01-06T17:35:35Z by itamarst

So now my guess is that the bug is "the code does multipart/form-data parsing even when that MIME type isn't set." Which in Python 2 would be a harmless waste of effort, and I guess blows up in Python 3.
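
If that guess is right, one way to avoid the problem is to only run form parsing when the request declares a form content type. The following is a sketch of such a guard, not the change that was actually made for this ticket: `should_parse_form` and `FORM_CONTENT_TYPES` are hypothetical names, and including application/x-www-form-urlencoded alongside multipart/form-data is an assumption.

```python
from email.message import Message

# Assumed set of content types that warrant form parsing.
FORM_CONTENT_TYPES = frozenset(
    {"multipart/form-data", "application/x-www-form-urlencoded"}
)


def should_parse_form(content_type_header):
    """Return True only if the Content-Type header names a form encoding."""
    if content_type_header is None:
        # No declared type: do not guess that the body is a form.
        return False
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() lowercases the value and drops parameters
    # such as "; boundary=...".
    return msg.get_content_type() in FORM_CONTENT_TYPES


print(should_parse_form("application/json"))
print(should_parse_form("multipart/form-data; boundary=abc"))
```

With a check like this, a JSON body such as the one in comment 6 would skip the form parser entirely rather than being written to a text-mode file.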

comment:9 Changed at 2022-01-07T19:37:42Z by itamarst

  • Resolution set to fixed
  • Status changed from new to closed