On Mon, 22 Feb 2021, Greg KH wrote:

> On Mon, Feb 22, 2021 at 06:22:44AM -0500, Robert P. J. Day wrote:
> > On Thu, 18 Feb 2021, Lennart Poettering wrote:
> >
> > > On Do, 18.02.21 11:48, Robert P. J. Day (rpjday@xxxxxxxxxxxxxx) wrote:
> > >
> > > > A colleague has reported the following apparent issue in a fairly
> > > > old (v230) version of systemd -- this is in a Yocto Project Wind River
> > > > Linux 9 build, hence the age of the package.
> > > >
> > > > As reported to me (and I'm gathering more info), the system was
> > > > being put through some "longevity testing" by repeatedly adding,
> > > > removing, activating and de-activating network interfaces. According
> > > > to the report, the result was heap space slowly but inexorably being
> > > > consumed.
> > > >
> > > > While waiting for more info, I'm going to examine the commit log for
> > > > systemd from v230 moving forward to collect any commits that address
> > > > memory leaks, then peruse them more carefully to see if they might
> > > > resolve the problem.
> > > >
> > > > I realize it's asking a bit for folks here to remember that far
> > > > back, but does this issue sound at all familiar? Any pointers that
> > > > might save me some time? Thanks.
> > >
> > > Note that our hash tables operate with an allocation cache: when
> > > adding entries to them and then removing them again the memory
> > > required for that is not returned to the OS but added to a local
> > > cache. When the next entry is then added again, we recycle the
> > > cached entry instead of asking for new memory again. This allocation
> > > cache is a bit quicker than going to malloc() all the time, but
> > > means if you just watch the heap you'll assume there's a leak even
> > > though there isn't really, the memory is not lost after all, and
> > > will be reused eventually if we need it.
> > >
> > > You may use the env var SYSTEMD_MEMPOOL=0 to turn this logic off,
> > > but not sure v230 already knew that env var.
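
The allocation-cache behaviour described in the quoted reply can be illustrated with a toy model. This is only a sketch of the general free-list idea, not systemd's actual hash table or mempool code: the class `PooledAllocator`, the helper `churn`, and the counters are hypothetical names invented for this illustration, and a Python dict stands in for a hash table entry.

```python
class PooledAllocator:
    """Toy model of an allocation cache (free list); NOT systemd's implementation."""

    def __init__(self):
        self.free_list = []       # released entries, kept around for reuse
        self.os_allocations = 0   # how often we had to "ask the OS" for memory

    def alloc(self):
        if self.free_list:
            return self.free_list.pop()  # recycle a cached entry
        self.os_allocations += 1         # cache empty: stands in for malloc()
        return {}                        # a blank "entry"

    def free(self, entry):
        entry.clear()
        self.free_list.append(entry)     # cached, never handed back to the OS


def churn(pool, cycles, batch):
    """Add and then remove `batch` entries, `cycles` times over."""
    for _ in range(cycles):
        live = [pool.alloc() for _ in range(batch)]
        for entry in live:
            pool.free(entry)


pool = PooledAllocator()
churn(pool, cycles=1000, batch=8)
print(pool.os_allocations, len(pool.free_list))  # prints "8 8"
```

In this model the memory held by the process never shrinks after entries are removed, which is what makes it look like a leak from the outside, but it also stops growing once the cache covers the peak number of live entries. Under that assumption, a footprint that keeps climbing under a steady add/remove workload would not be explained by the cache alone.
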
> >
> > well, we seem to have isolated the issue, here it is in a nutshell
> > based on a condensed note i got from someone who tracked it down this
> > weekend. the memory leak is triggered by:
> >
> > $ ssh root@<target> -p 830 -s netconf [830 = netconf over SSH]
> >
> > long story short, according to jemalloc profiling, there is a massive
> > memory leak in DBUS code, to the tune of about 500M/day on a running
> > system. i'm perusing the profiling output now, but does any of this
> > sound even remotely familiar to anyone? i realize that's just a
> > summary, but does anyone remember seeing something related to this
> > once upon a time? [heavily-patched systemd_230 from wind river linux
> > 9].
>
> Given that this is a heavily patched system, please get support from
> the vendor that provided this as you are paying for this. Don't ask
> the community to try to remember what happened with an old obsolete
> version of software, that's crazy...

that's already in the pipeline, i was simply asking if anyone had
ever *seen* this before, just so we might be able to say, "hey, we're
not the first this has happened to."

also, on the off-chance that anyone else is using a similarly-dated
version of systemd, they might say, "hmmmmm, that sounds suspiciously
like what's happening with *us*."

just trying to be helpful.

rday
_______________________________________________
systemd-devel mailing list
systemd-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/systemd-devel