[tahoe-lafs-trac-stream] [Tahoe-LAFS] #227: our automated memory measurements might be measuring the wrong thing
Tahoe-LAFS
trac at tahoe-lafs.org
Fri Mar 20 20:26:59 UTC 2015
#227: our automated memory measurements might be measuring the wrong thing
------------------------------------+-------------------------------------
Reporter: zooko | Owner: zooko
Type: defect | Status: assigned
Priority: major | Milestone: eventually
Component: dev-infrastructure | Version: 0.7.0
Resolution: | Keywords: memory performance unix
Launchpad Bug: |
------------------------------------+-------------------------------------
New description:
As visible in [http://allmydata.org/tahoe-figleaf-graph/hanford.allmydata.com-tahoe_memstats.html the memory usage graphs],
pycryptopp increased the static memory footprint by about 6 MiB when we
added it in early November (I think it was November 6, although
[wiki:Performance the Performance page] says November 9), and removing
pycrypto on 2007-12-03 seems to have had almost no benefit in reducing
memory footprint.

This reminds me of the weirdness about the 64-bit version using way more
memory than we expected.

Hm. I think maybe we are erring by using "VmSize" (from /proc/*/status) as
our proxy for memory usage. That number is the total size of the virtual
address space requested by the process, if I understand correctly. So for
example, mmap'ing a file adds the file's size to your VmSize, although it
does not (by itself) use any memory.
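
Here's a rough sketch of the effect I mean (untested, and it assumes a
Linux-style /proc): mmap'ing a big region makes VmSize jump even though no
pages are actually resident:

{{{
import mmap

def read_vmsize_kb():
    # VmSize as reported in /proc/self/status, in kB
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmSize:"):
                return int(line.split()[1])

before = read_vmsize_kb()
m = mmap.mmap(-1, 512 * 1024 * 1024)  # 512 MiB, anonymous, never touched
after = read_vmsize_kb()
print("VmSize grew by %d kB without touching a byte" % (after - before))
m.close()
}}}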

Linux kernel hackers seem to be in universal agreement that it is a bad
idea to use VmSize for anything:

http://bmaurer.blogspot.com/2006/03/memory-usage-with-smaps.html
http://lwn.net/Articles/230975/

But what's the alternative? We could read "smaps" and see if we can get a
better metric out of that.
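
Something like this sketch (assuming the usual /proc/<pid>/smaps field
names, with per-mapping values in kB) would give us Pss and Private_Dirty
totals to graph instead:

{{{
def smaps_totals(pid="self"):
    # Sum a few interesting fields across all mappings in /proc/<pid>/smaps.
    totals = {"Rss": 0, "Pss": 0, "Private_Dirty": 0}
    with open("/proc/%s/smaps" % pid) as f:
        for line in f:
            fields = line.split()
            key = fields[0].rstrip(":")
            if key in totals:
                totals[key] += int(fields[1])  # kB
    return totals

print(smaps_totals())
}}}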

By the way, if anyone wants to investigate the memory usage more closely,
the valgrind tool named massif has been rewritten, so maybe it will work
this time.
--
Comment (by warner):

I took a quick look at smem today; it seems pretty nice. I think the "USS"
(Unique Set Size) might be a good thing to track: it's the amount of memory
you'd get back by killing the process. For Tahoe, the main thing we care
about is that the client process isn't leaking or over-allocating the
memory used to hold files during the upload/download process, and that
memory isn't going to be shared with any other process. So even if it
doesn't answer the "can I fit this tahoe node/workload on my NN-MB
computer?" question, it *does* answer the question of whether we're meeting
our memory-complexity design goals.
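
For reference, USS is basically the private (unshared) portion of the
process: roughly the sum of Private_Clean + Private_Dirty over
/proc/<pid>/smaps, if I've got smem's definition right. A sketch:

{{{
def uss_kb(pid):
    # Sum the private (unshared) pages of the process, in kB,
    # by walking /proc/<pid>/smaps.
    total = 0
    with open("/proc/%d/smaps" % pid) as f:
        for line in f:
            if line.startswith(("Private_Clean:", "Private_Dirty:")):
                total += int(line.split()[1])
    return total
}}}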

Installing `smem` requires a bunch of other stuff (python-gtk2, python-tk,
matplotlib), since it has a graphical mode that we don't care about, but
that's not a big deal. There's a process-filter option which I can't find
documentation on, which we'd need in order to limit the output to the tahoe
client's own PID. And then the main downside I can think of is that you
have to shell out to a not-small python program for each sample (vs.
reading /proc/self/status, which is basically free), so somebody might be
worried about the performance impact.
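
Per-sample, the shell-out would look roughly like this; I'm assuming smem's
-H (no header), -P (command-name filter), and -c (column list) options
behave the way I think they do, so double-check against whatever version
actually gets installed:

{{{
import subprocess

def sample_uss_via_smem(name_regex="tahoe"):
    # Ask smem for "pid uss" rows (USS in kB) for matching processes.
    out = subprocess.check_output(
        ["smem", "-H", "-P", name_regex, "-c", "pid uss"])
    samples = {}
    for line in out.splitlines():
        if not line.strip():
            continue
        pid, uss = line.split()
        samples[int(pid)] = int(uss)
    return samples
}}}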
--
Ticket URL: <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/227#comment:10>
Tahoe-LAFS <https://Tahoe-LAFS.org>
secure decentralized storage