[tahoe-lafs-trac-stream] [tahoe-lafs] #2024: downloader hangs when server returns empty string

tahoe-lafs trac at tahoe-lafs.org
Wed Jul 17 15:16:20 UTC 2013


#2024: downloader hangs when server returns empty string
-------------------------+-------------------------------------------------
     Reporter:  warner   |      Owner:
         Type:  defect   |     Status:  new
     Priority:  normal   |  Milestone:  eventually
    Component:           |    Version:  1.10.0
      code-encoding      |   Keywords:  download hang denial-of-service
   Resolution:           |              security
Launchpad Bug:           |
-------------------------+-------------------------------------------------
Changes (by daira):

 * keywords:   => download hang denial-of-service security


Old description:

> While investigating the {{{test_download.Corrupt.test_each_byte
> catalog_detection=True}}} failure for #1382, after fixing the bitrot, we
> discovered that the downloader hangs when the server responds to a read
> request with zero bytes. In particular, when the test corrupts offset 8
> (which is the MSB of the {{{num_leases}}} value), the storage server
> believes that the container includes a ridiculous number of leases,
> therefore the leases start ({{{self._lease_offset}}}) before the
> beginning of the file, therefore all reads are truncated down to nothing.
>
> The actual bug is that the downloader doesn't handle this well. It looks
> like, when the read fails to satisfy any of the desired bytes, the
> downloader just issues a new read request, identical to the first. It
> then loops forever, trying to fetch the same range and always failing.
> This is an availability problem, since a corrupt/malicious server could
> prevent downloads from proceeding by mangling its responses in this way.
>
> Instead, the downloader should never ask for a given range of bytes twice
> from the same storage server (at least without some intervening event
> like a reconnection). So the downloader will need to remember what it
> asked for, and if it doesn't get it, add those offsets to a list of
> "bytes that this server won't give us". Then, if we absolutely need any
> bytes that appear in that list, we declare the Share to be a loss and
> switch to a different one.
>
> A simpler rule would probably work too: any zero-length reads are grounds
> to reject the share. We do some speculative reads (based upon assumptions
> about the segment size), so we'd need to look carefully at that code and
> make sure that the speculation cannot correctly yield zero bytes. In
> particular I'm thinking about the UEB fetch from the end of the share:
> its offset depends upon the size of the hash trees, so if our guessed
> segment size is too small, the UEB fetch might return zero bytes, but the
> correct thing to do is to re-fetch it from the correct place once we've
> grabbed the offset table.
>
> The workaround I recommended for markb's work in #1382 is to just refrain
> from corrupting those four {{{num_leases}}} bytes. Once we fix this
> ticket, we should go back to {{{test_download.py}}} and remove that
> workaround.

New description:

 While investigating the {{{test_download.Corrupt.test_each_byte
 catalog_detection=True}}} failure for #1382, after fixing the bitrot, we
 discovered that the downloader hangs when the server responds to a read
 request with zero bytes. In particular, when the test corrupts offset 8
 (the MSB of the {{{num_leases}}} value), the storage server believes that
 the container holds a ridiculous number of leases, so the lease area
 ({{{self._lease_offset}}}) starts before the beginning of the file and
 every read is truncated down to nothing.
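 To make that failure mode concrete, here is a hedged sketch of the
 lease-offset arithmetic (the {{{LEASE_SIZE}}} constant and function names
 are illustrative assumptions, not the actual storage-server code):

```python
LEASE_SIZE = 72  # assumed size of one lease record at the container's tail

def truncated_read_length(filesize, num_leases, seekpos, length):
    # Leases live at the end of the share container, so share data must
    # stop where the lease area begins.
    lease_offset = filesize - num_leases * LEASE_SIZE
    return max(0, min(length, lease_offset - seekpos))

# Sane container: one lease leaves plenty of room for data.
print(truncated_read_length(10000, 1, 0, 100))            # 100

# Flipping the MSB of num_leases (1 -> 0x01000001) drives lease_offset
# far below zero, so every read is clamped down to zero bytes.
print(truncated_read_length(10000, 0x01000001, 0, 100))   # 0
```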

 The actual bug is that the downloader doesn't handle this well. When a
 read fails to satisfy any of the desired bytes, the downloader simply
 issues a new read request identical to the first, then loops forever,
 trying to fetch the same range and always failing. This is an
 availability problem, since a corrupt or malicious server could prevent
 downloads from proceeding by mangling its responses in this way.

 Instead, the downloader should never ask for a given range of bytes twice
 from the same storage server (at least without some intervening event like
 a reconnection). So the downloader will need to remember what it asked
 for, and if it doesn't get it, add those offsets to a list of "bytes that
 this server won't give us". Then, if we absolutely need any bytes that
 appear in that list, we declare the Share to be a loss and switch to a
 different one.
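 A minimal sketch of that bookkeeping might look like the following (the
 class and method names are assumptions for illustration; the real
 downloader's Share logic is more involved):

```python
class ShareRejected(Exception):
    """Raised when a server has proven unwilling to supply needed bytes."""

class ShareReader:
    def __init__(self, remote_read):
        self._remote_read = remote_read  # callable(offset, length) -> bytes
        self._refused = set()            # byte offsets this server won't give us

    def read(self, offset, length):
        wanted = set(range(offset, offset + length))
        if wanted & self._refused:
            # We already know the server won't supply some of these bytes:
            # give up on this share instead of looping forever.
            raise ShareRejected("server previously withheld this range")
        data = self._remote_read(offset, length)
        got = set(range(offset, offset + len(data)))
        self._refused |= wanted - got    # remember what we did not get
        return data
```

 With this in place, a server that always answers with zero bytes causes
 the second request for the same range to raise {{{ShareRejected}}},
 letting the downloader switch to a different share instead of hanging.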

 A simpler rule would probably work too: any zero-length read is grounds
 to reject the share. We do some speculative reads (based upon assumptions
 about the segment size), so we'd need to look carefully at that code and
 make sure that the speculation cannot legitimately yield zero bytes. In
 particular I'm thinking about the UEB fetch from the end of the share:
 its offset depends upon the size of the hash trees, so if our guessed
 segment size is too small, the UEB fetch might return zero bytes, but the
 correct thing to do is to re-fetch it from the correct place once we've
 grabbed the offset table.
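 One way to express that caveat (all names here are hypothetical): treat an
 empty read as fatal only when the request was derived from the
 authoritative offset table, not from a guessed segment size:

```python
from collections import namedtuple

# speculative=True marks reads placed using a guessed segment size,
# such as the initial UEB fetch from the end of the share.
ReadRequest = namedtuple("ReadRequest", ["offset", "length", "speculative"])

def on_empty_read(request):
    if request.speculative:
        # A guessed-offset read may simply have aimed past the real data;
        # re-derive the offset from the offset table and try once more.
        return "refetch_from_offset_table"
    # The offset table says these bytes must exist, so an empty answer
    # means the share (or the server) is bad.
    return "reject_share"
```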

 The workaround I recommended for markb's work in #1382 is to just refrain
 from corrupting those four {{{num_leases}}} bytes. Once we fix this
 ticket, we should go back to {{{test_download.py}}} and remove that
 workaround.

--

Comment:

 Replying to [ticket:2024 warner]:
 > A simpler rule would probably work too: any zero-length reads are
 grounds to reject the share.

 That wouldn't work against a malicious server that, say, returns 1 byte.
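 So a robust defence has to bound retries rather than merely test for
 emptiness. A hedged sketch of one such rule (names assumed): allow only a
 fixed number of attempts per exact byte range, which catches both empty
 responses and drip-fed 1-byte answers to repeated requests for the same
 range.

```python
from collections import Counter

class RangeTracker:
    """Abandon a share after too many attempts on the same byte range."""

    def __init__(self, max_attempts=2):
        self._attempts = Counter()
        self._max = max_attempts

    def should_abandon(self, offset, length):
        # Count every attempt on this exact range; once the cap is
        # exceeded, the caller should reject the share and move on.
        self._attempts[(offset, length)] += 1
        return self._attempts[(offset, length)] > self._max
```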

 > The workaround I recommended for markb's work in #1382 is to just
 refrain from corrupting those four {{{num_leases}}} bytes. Once we fix
 this ticket, we should go back to {{{test_download.py}}} and remove that
 workaround.

 +1.

-- 
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/2024#comment:1>
tahoe-lafs <https://tahoe-lafs.org>
secure decentralized storage


More information about the tahoe-lafs-trac-stream mailing list