[tahoe-dev] [tahoe-lafs] #671: bring back sizelimit (i.e. max consumed, not min free) (was: sizelimit)
tahoe-lafs
trac at allmydata.org
Mon Nov 30 13:43:47 PST 2009
#671: bring back sizelimit (i.e. max consumed, not min free)
----------------------------+-----------------------------------------------
Reporter: zooko | Owner:
Type: defect | Status: new
Priority: major | Milestone: 1.6.0
Component: code-nodeadmin | Version: 1.3.0
Keywords: | Launchpad_bug:
----------------------------+-----------------------------------------------
Old description:
> We used to have a {{{sizelimit}}} option which would do a recursive
> examination of the storage directory at startup and calculate
> approximately how much disk space was used, and refuse to accept new
> shares if the disk space would exceed the limit. #34 shows when it was
> implemented. It was later removed because it took a long time -- about
> 30 minutes -- on allmydata.com storage servers, and the servers remained
> unavailable to clients during this period, and because it was replaced by
> the {{{reserved_space}}} configuration, which was very fast and which
> satisfied the requirements of the allmydata.com storage servers.
>
> This ticket is to reintroduce {{{sizelimit}}} because
> [http://allmydata.org/pipermail/tahoe-dev/2009-March/001493.html some
> users want it]. This will mean that the storage server doesn't start
> serving clients until it finishes the disk space inspection at startup.
>
> To close this ticket, you do *not* need to implement some sort of
> interleaving of inspecting disk space and serving clients.
>
> To close this ticket, you MUST NOT implement any sort of automatic
> deletion of shares to get back under the sizelimit if you find yourself
> over it (for example, if the user has changed the sizelimit to be lower
> after you've already filled it to the max), but you SHOULD implement some
> sort of warning message to the log if you detect this condition.
New description:
We used to have a {{{sizelimit}}} option which would do a recursive
examination of the storage directory at startup and calculate
approximately how much disk space was used, and refuse to accept new
shares if the disk space would exceed the limit. #34 shows when it was
implemented. It was later removed because it took a long time -- about 30
minutes -- on allmydata.com storage servers, and the servers remained
unavailable to clients during this period, and because it was replaced by
the {{{reserved_space}}} configuration, which was very fast and which
satisfied the requirements of the allmydata.com storage servers.
This ticket is to reintroduce {{{sizelimit}}} because
[http://allmydata.org/pipermail/tahoe-dev/2009-March/001493.html some
users want it]. This might mean that the storage server doesn't start
serving clients until it finishes the disk space inspection at startup.
Note that {{{sizelimit}}} would impose a maximum limit on the amount of
space consumed by the node's {{{storage/shares/}}} directory, whereas
{{{reserved_space}}} imposes a minimum limit on the amount of remaining
available disk space. In general, {{{reserved_space}}} can be implemented
by asking the OS for filesystem stats, whereas {{{sizelimit}}} must be
implemented by tracking the node's own usage and accumulating the sizes
over time.
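For illustration, here is a minimal sketch (Python; the helper names and
the config-plumbing are hypothetical, not the actual Tahoe-LAFS API) of
how the two checks differ: one cheap OS query versus a walk over the
node's own shares.

{{{
#!python
import os

def get_available_space(storagedir):
    # reserved_space: a single statvfs call reports how much space
    # remains on the filesystem -- fast, no traversal needed.
    s = os.statvfs(storagedir)
    return s.f_bavail * s.f_frsize

def get_consumed_space(sharedir):
    # sizelimit: we must walk (or otherwise track) our own shares and
    # accumulate their sizes, which is slow on a large store.
    total = 0
    for dirpath, dirnames, filenames in os.walk(sharedir):
        for fn in filenames:
            total += os.stat(os.path.join(dirpath, fn)).st_size
    return total

def accepting_shares(storagedir, sharedir, reserved_space, sizelimit):
    # Hypothetical combined test: refuse new shares if either limit
    # would be violated.
    if get_available_space(storagedir) <= reserved_space:
        return False
    if get_consumed_space(sharedir) >= sizelimit:
        return False
    return True
}}}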
To close this ticket, you do *not* need to implement some sort of
interleaving of inspecting disk space and serving clients.
To close this ticket, you MUST NOT implement any sort of automatic
deletion of shares to get back under the sizelimit if you find yourself
over it (for example, if the user has changed the sizelimit to be lower
after you've already filled it to the max), but you SHOULD implement some
sort of warning message to the log if you detect this condition.
--
Comment(by warner):
(updated description)
Note that any sizelimit code is allowed to speed things up by remembering
state from one run to the next. The old code did the slow
recursive-traversal sharewalk to handle the (important) case where this
state was inaccurate or unavailable (e.g. when shares had been deleted by
some external process), and to account for the local-fs-level overhead
that makes /bin/ls and /bin/df report different numbers. But we could
trade off accuracy for speed: it should be acceptable to just ensure that
the sizelimit is eventually approximately correct.
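As a rough sketch of that remember-state-between-runs idea (the state
file name and helpers are illustrative, not the old Tahoe code): cache
the total on disk, trust it on the fast path, and fall back to the slow
sharewalk only when the cache is missing or corrupt. Note st_blocks
counts allocated 512-byte blocks, approximating what /bin/df sees rather
than the apparent sizes /bin/ls reports.

{{{
#!python
import os, json

STATEFILE = "diskusage.json"  # hypothetical name

def load_cached_usage(basedir):
    # Fast path: trust the total recorded on the previous run.
    try:
        with open(os.path.join(basedir, STATEFILE)) as f:
            return json.load(f)["total"]
    except (EnvironmentError, ValueError, KeyError):
        return None  # state missing/corrupt: caller must re-walk

def save_cached_usage(basedir, total):
    with open(os.path.join(basedir, STATEFILE), "w") as f:
        json.dump({"total": total}, f)

def full_sharewalk(sharedir):
    # Slow path: recursive traversal, counting allocated blocks to
    # include per-file filesystem overhead.
    total = 0
    for dirpath, dirnames, filenames in os.walk(sharedir):
        for name in filenames:
            total += os.stat(os.path.join(dirpath, name)).st_blocks * 512
    return total
}}}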
A modern implementation should probably use the "share crawler" mechanism,
doing a {{{stat}}} on each share, and adding up the results. It can store
state in the normal crawler stash, probably in the form of a single total-
bytes value per prefixdir. The do-I-have-space test should use
{{{max(last-pass, current-pass)}}}, to handle the fact that the current-pass
value will be low while the prefixdir is being scanned. The crawler would
replace this state on each pass, so any stale information would go away
within a few hours or days.
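A toy sketch of that per-prefixdir bookkeeping and the
{{{max(last-pass, current-pass)}}} test (crawler plumbing omitted; the
class and method names are hypothetical, not the real {{{ShareCrawler}}}
interface):

{{{
#!python
import os

class SizeTracker:
    # One total-bytes value per prefixdir, for both the completed
    # pass and the pass currently in progress.
    def __init__(self):
        self.last_pass = {}     # prefixdir -> bytes, finished pass
        self.current_pass = {}  # prefixdir -> bytes, being accumulated

    def process_prefixdir(self, prefix, sharedir):
        # Called as the crawler visits each prefixdir: stat every
        # share file and record the sum.
        total = 0
        for dirpath, dirnames, filenames in os.walk(sharedir):
            for fn in filenames:
                total += os.stat(os.path.join(dirpath, fn)).st_size
        self.current_pass[prefix] = total

    def finished_cycle(self):
        # Stale state is replaced wholesale at the end of each pass.
        self.last_pass = self.current_pass
        self.current_pass = {}

    def used_space(self):
        # max() of the two passes: current_pass reads low while the
        # scan is still partway through the prefixdirs.
        return max(sum(self.last_pass.values()),
                   sum(self.current_pass.values()))

    def have_space_for(self, size, sizelimit):
        return self.used_space() + size <= sizelimit
}}}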
Ideally, the server code should also keep track of new shares that were
written into each prefixdir, and add the sizes of those shares to the
state value, but only until the next crawler pass had swung by and seen
the new shares. You'd also want to do something similar with shares that
were deleted (by the lease expirer). To accomplish this, you'd want to
make a {{{ShareCrawler}}} subclass that tracks this extra space in a per-
prefixdir dict, and have the storage-server/lease-expirer notify it every
time a share was created or deleted. The {{{ShareCrawler}}} subclass is in
the right position to know when the crawler has reached a bucket.
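Continuing the toy sketch above, a hypothetical subclass that is notified
of share additions and deletions, and drops the per-prefixdir adjustment
once the crawler has re-scanned that bucket (again illustrative only; the
real {{{ShareCrawler}}} interface differs):

{{{
#!python
class NotifyingSizeTracker(SizeTracker):
    def __init__(self):
        SizeTracker.__init__(self)
        # prefixdir -> bytes added/removed since the crawler last
        # visited that bucket
        self.adjustments = {}

    def share_added(self, prefix, nbytes):
        # The storage server calls this when it writes a new share.
        self.adjustments[prefix] = self.adjustments.get(prefix, 0) + nbytes

    def share_deleted(self, prefix, nbytes):
        # The lease expirer calls this when it removes a share.
        self.adjustments[prefix] = self.adjustments.get(prefix, 0) - nbytes

    def process_prefixdir(self, prefix, sharedir):
        # The crawler has now seen this bucket directly, so the
        # adjustment for it is no longer needed.
        SizeTracker.process_prefixdir(self, prefix, sharedir)
        self.adjustments.pop(prefix, None)

    def used_space(self):
        return (SizeTracker.used_space(self) +
                sum(self.adjustments.values()))
}}}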
Doing this with the crawler would also have the nice side-effect of
balancing fast startup with accurate size limiting. Even though this
ticket has been defined as not requiring such a feature, I'm sure users
would appreciate it.
--
Ticket URL: <http://allmydata.org/trac/tahoe/ticket/671#comment:1>
tahoe-lafs <http://allmydata.org>
secure decentralized file storage grid