Opened at 2007-12-09T13:20:41Z
Last modified at 2020-12-09T14:13:06Z
#227 closed defect
our automated memory measurements might be measuring the wrong thing — at Version 10
Reported by: | zooko | Owned by: | zooko |
---|---|---|---|
Priority: | major | Milestone: | eventually |
Component: | dev-infrastructure | Version: | 0.7.0 |
Keywords: | memory performance unix | Cc: | zandr |
Launchpad Bug: |
Description (last modified by warner)
As visible in the memory usage graphs, pycryptopp increased the static memory footprint by about 6 MiB when we added it in early November (I think it was November 6, although the Performance page says November 9), and removing pycrypto on 2007-12-03 seems to have had almost no benefit in reducing memory footprint.
This reminds me of the weirdness about the 64-bit version using way more memory than we expected.
Hm. I think maybe we are erring by using "VmSize" (from /proc/*/status) as our proxy for memory usage. That number is the total size of the virtual address space requested by the process, if I understand correctly. So for example, mmap'ing a file adds the file's size to your VmSize, although it does not (by itself) use any memory.
Linux kernel hackers seem to be in universal agreement that it is a bad idea to use VmSize for anything:
http://bmaurer.blogspot.com/2006/03/memory-usage-with-smaps.html
http://lwn.net/Articles/230975/
But what's the alternative? We could read "smaps" and see if we can get a better metric out of that.
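(As a rough illustration only -- Linux-specific, not part of any existing tooling, and the helper names below are made up -- a process can read both kinds of numbers itself: VmSize/VmRSS from /proc/self/status, and a proportional figure by summing the Pss lines of /proc/self/smaps, provided the kernel is new enough to report Pss at all.)

    # Rough sketch: read memory figures straight from /proc (Linux only).

    def read_status_field_kib(field):
        """Return the named Vm* field from /proc/self/status, in KiB."""
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith(field + ":"):
                    return int(line.split()[1])  # reported as "NNNN kB"
        return None

    def read_pss_kib():
        """Sum the Pss of every mapping in /proc/self/smaps, in KiB."""
        total = 0
        with open("/proc/self/smaps") as f:
            for line in f:
                if line.startswith("Pss:"):
                    total += int(line.split()[1])
        return total

    print("VmSize: %s KiB" % read_status_field_kib("VmSize"))
    print("VmRSS:  %s KiB" % read_status_field_kib("VmRSS"))
    print("Pss:    %s KiB" % read_pss_kib())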
By the way, if anyone wants to investigate more closely the memory usage, the valgrind tool named massif has been rewritten so maybe it will work this time.
Change History (10)
comment:1 Changed at 2008-01-01T22:48:21Z by zooko
- Summary changed from pycryptopp uses up a 6 MB of memory (or at least it increases VmSize by 6M) to our automated memory measurements might be measuring the wrong thing
comment:2 Changed at 2008-01-01T22:52:18Z by zooko
- Component changed from unknown to dev-infrastructure
- Owner changed from nobody to zooko
- Status changed from new to assigned
comment:3 Changed at 2008-01-04T04:30:25Z by zooko
Ooh, and as Seb just reminded me, I can turn off overcommit first too, to make it more deterministic/analyzable.
comment:4 Changed at 2008-01-17T02:37:00Z by zooko
Please see:
http://allmydata.org/pipermail/tahoe-dev/2008-January/000341.html
Zandr: how would you feel about turning off swap for tahoeslave-feisty and for zandr-64? I believe that turning off swap is necessary in order to get a useful measurement of memory. (Personally, I turn off swap on my Linux systems anyway.)
I think that turning off memory overcommit isn't strictly necessary for doing measurements, but it might help by showing memory exhaustion errors in a more deterministic way than the Linux OOM killer.
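(For the record, and purely as a hypothetical recipe rather than anything agreed on here: on most Linux systems swap can be turned off at runtime with "swapoff -a", and strict overcommit accounting can be enabled with "sysctl vm.overcommit_memory=2"; both revert at reboot unless made permanent in /etc/fstab and /etc/sysctl.conf.)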
comment:6 Changed at 2009-05-04T17:44:17Z by zooko
By the way, I don't think I succeeded at boiling down the results of my research for the consumption of others. Here's the boiled-down version: measuring the vsize as we do in our Performance page gives a number much higher than what we actually want to know, and it changes even when the thing that we care about hasn't changed, so it is either useless or only barely useful. Measuring the resident set size would give something probably smaller or possibly larger than the thing we want to know, and it too would change randomly when the thing we care about hasn't changed. The two of them put together and then eyeballed might give you insight, or might just mislead you.
The idea that I had and wrote up in this ticket (above) was a third option: turn off swap and measure resident. That gives you a number that is probably pretty close to what you care about, if what you care about is something like "How much RAM do I need in my machine to run one Tahoe node without it needing to swap?". (If you have a different idea of what you want to know then by all means speak up.)
Anyway, that's all my attempt to restate the history of this ticket and explain why you shouldn't pay much if any attention to the numbers on the Performance page. The new news is that Matt Mackall has been working on this problem and has a new tool that can help (on Linux):
comment:7 Changed at 2009-05-04T17:46:30Z by zooko
comment:8 Changed at 2009-06-17T20:04:38Z by zooko
Here's the permanent URL for that LWN.net article: Matt Mackall has invented "smem", which provides measurements of memory usage that are actually useful: http://lwn.net/Articles/329458/
comment:9 Changed at 2009-12-04T05:16:08Z by davidsarah
- Keywords memory performance unix added
comment:10 Changed at 2015-03-20T20:26:58Z by warner
- Description modified (diff)
I took a quick look at smem today, seems pretty nice. I think the "USS" (Unique Set Size) might be a good thing to track: it's the amount of memory you'd get back by killing the process. For Tahoe, the main thing we care about is that the client process isn't leaking or over-allocating the memory used to hold files during the upload/download process, and that memory isn't going to be shared with any other process. So even if it doesn't answer the "can I fit this tahoe node/workload on my NN-MB computer" question, it *does* answer the question of whether we're meeting our memory-complexity design goals.
Installing smem requires a bunch of other stuff (python-gtk2, python-tk, matplotlib), since it has a graphical mode that we don't care about, but that's not a big deal. There's a process-filter thing which I can't find documentation on, which we'd need to limit the output to the tahoe client's own PID. And then the main downside I can think of is that you have to shell out to a not-small python program for each sample (vs reading /proc/self/status, which is basically free), so somebody might be worried about the performance impact.
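(If the shell-out overhead did turn out to matter, a node could in principle compute its own USS straight from /proc, since USS is just the sum of the private pages of every mapping. A minimal sketch of that idea -- assuming a kernel that reports Private_Clean/Private_Dirty in smaps; none of this exists in the tree:)

    # Sketch only: sum the private lines of /proc/<pid>/smaps to get the
    # same number smem calls USS, without shelling out to smem.

    def uss_kib(pid="self"):
        total = 0
        with open("/proc/%s/smaps" % pid) as f:
            for line in f:
                if line.startswith(("Private_Clean:", "Private_Dirty:")):
                    total += int(line.split()[1])  # values are in kB
        return total

    print("USS of this process: %d KiB" % uss_kib())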
Here is a way to test whether your memory measurement is giving you useful answers. Take a machine with little physical RAM -- I have one here with 500 MB -- turn off swap, and start more and more Tahoe clients, each one doing the "upload" operation, until eventually you get malloc failures or Linux OOM kills or whatever.
Now divide your physical RAM by the number of Tahoe clients that you were able to run without incurring memory problems. The result of that division is a reasonable approximation of the "memory requirements" of the current Tahoe client.
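(To make the arithmetic concrete with made-up numbers: if the 500 MB machine managed, say, 10 clients before hitting memory errors, that would suggest roughly 500 / 10 = 50 MB per client. The "10" here is purely hypothetical, not a measurement.)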
This sounds like fun -- I'll accept this ticket.