[tahoe-lafs-trac-stream] [Tahoe-LAFS] #3763: Potential issues with `PUT /v1/lease/:storage_index` in GBS protocol
Tahoe-LAFS
trac at tahoe-lafs.org
Tue Aug 17 14:21:41 UTC 2021
#3763: Potential issues with `PUT /v1/lease/:storage_index` in GBS protocol
--------------------------+-----------------------------------
Reporter: itamarst | Owner: exarkun
Type: task | Status: new
Priority: normal | Milestone: HTTP Storage Protocol
Component: unknown | Version: n/a
Resolution: | Keywords:
Launchpad Bug: |
--------------------------+-----------------------------------
Comment (by exarkun):
> 1. It says "If there are no shares for the given storage_index then do
nothing and return NO CONTENT." 204 is a success code... this seems like
maybe more like an error, and should therefore have 4xx code?
In e9563ebc02a67314060f38e8533e2e1e551ce9a6 warner changes the Foolscap-
based implementation to return None instead of raising an exception. The
commit message is:
> change RIStorageServer.remote_add_lease to exit silently in case of no-
such-bucket, instead of raising IndexError, because that makes the
upcoming --add-lease feature faster and less noisy
I failed to find much more detail about this. Is it only about the fact
that exceptions are slow? Maybe transferring exceptions via Foolscap is
even slower? If that is all then it seems like we don't have to mirror
this choice in GBS. It seems like a good idea to tell the client that
they tried to do something and the server did not do it, rather than let
them believe the server did do it.
I think 410 is meant for cases where the server specifically knows the
resource used to exist. I don't think the server will necessarily know
this. A 404 might make the most sense?
> 2. What should be returned if there are shares? Anything need to be in
the response body? Just 200?
The Foolscap-based implementation returns None. That makes me think an
empty body makes sense. For the status code, either 200 or 201. 201
seems like the right fit to me, since we *did* just "create" a lease.
Except https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/201 says
its for *resource* creation and we should return `Location` in the
response header ... so I guess not. Just a 200.
> 3. "If the renew-secret value matches an existing lease then that lease
will be renewed instead." Why is this, given there exists a separate renew
API endpoint? This seems like it could mask certain categories of client
bugs.
I looked around the codebase a bit. I tried to find code that uses
`add_lease` and see what the consequences of it being mistakenly used
instead of `renew_lease` might be. I also looked for users of
`renew_lease` and tried to understand why they were correct (as opposed to
a use of `add_lease`). I didn't come up with much.
One part of the server code has this:
{{{
def add_or_renew_lease(self, lease_info):
try:
self.renew_lease(lease_info.renew_secret,
lease_info.expiration_time)
except IndexError:
self.add_lease(lease_info)
}}}
where the difference between the two is flagrantly ignored...
I think the high-level view here might be that leases are a very lightly
used Tahoe feature and garbage collection may be even more lightly used -
so the consequences of getting lease logic wrong are minimal or non-
existent.
I *feel* like the overlap between `add_lease` and `renew_lease` is a
mistake. It would be convenient if the `renew_lease` functionality were
*100%* encompassed by that overlap (because then `renew_lease` could be
deleted) but there is the annoying `cancel_secret`...
Trying to enumerate options...
1. Leave it as-is (and/or fix it later)
2. Remove `renew_lease` from the network and Python APIs; adjust all
callers of the Python `renew_lease` API to use the `add_lease` Python API
instead.
3. Remove `renew_lease` from the network API; (re-)implement the Python
API to use the `add_lease` network API instead; invent a `cancel_secret`
along the way (maybe some kind of null value that indicates an
uncancellable lease).
4. Remove `renew_lease` from the network API; add `cancel_secret` to the
Python API; (re-)implement the Python API to use `add_lease` instead and
to provide a good `cancel_secret`.
5. Remove the renew semantics from the `add_lease` Python and network API;
update callers to provide appropriate error handling on a case-by-case
basis
6. Remove the renew semantics from the `add_lease` network API;
(re-)implement the Python API to provide error handling that automatically
falls back to using the `renew_lease` network API.
Probably should stop typing at this point and see if anyone else finds any
of this coherent.
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3763#comment:4>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage
More information about the tahoe-lafs-trac-stream
mailing list