Re: [PATCH v2 1/1] mm: smaps: split PSS into components

On Thu, May 30, 2019 at 11:23 PM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> On Fri 31-05-19 08:22:06, Michal Hocko wrote:
> > On Fri 31-05-19 08:04:01, Michal Hocko wrote:
> > > [Please always Cc linux-api mailing list (now added) when adding a new
> > > user visible API. Keeping the rest of the email intact for reference]
> > >
> > > On Thu 30-05-19 17:26:33, semenzato@xxxxxxxxxxxx wrote:
> > > > From: Luigi Semenzato <semenzato@xxxxxxxxxxxx>
> > > >
> > > > Report separate components (anon, file, and shmem)
> > > > for PSS in smaps_rollup.
> > > >
> > > > This helps understand and tune the memory manager behavior
> > > > in consumer devices, particularly mobile devices.  Many of
> > > > them (e.g. chromebooks and Android-based devices) use zram
> > > > for anon memory, and perform disk reads for discarded file
> > > > pages.  The difference in latency is large (e.g. reading
> > > > a single page from SSD is 30 times slower than decompressing
> > > > a zram page on one popular device), thus it is useful to know
> > > > how much of the PSS is anon vs. file.
> >
> > Could you describe how exactly are those new counters going to be used?

Yes.  We wish to gather stats of memory usage by groups of processes
on Chromebooks: various types of Chrome processes, Android processes
(for ARC++, i.e. Android running on Chrome OS), VMs, daemons etc.  See

https://chromium.googlesource.com/chromiumos/platform2/+/refs/heads/master/metrics/pgmem.cc

and related files. The stats help us tune the memory manager better in
different scenarios.  Without this patch we only have the total PSS,
but splitting it into components helps us deal with situations such as
a varying ratio of file vs. anon pages, which can result, for instance,
from starting/stopping Android.  (In theory the "swappiness" tunable
should help with that, but it doesn't seem effective under extreme
pressure, which is unfortunately rather common on these consumer
devices.)
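
As a rough illustration of the per-group accounting described above
(this is not the pgmem.cc code), the sketch below parses the PSS lines
of smaps_rollup and sums them over a group of processes.  The field
names Pss_Anon, Pss_File and Pss_Shmem are an assumption here, since
this message does not spell them out; parse_pss, sum_group and
read_rollup are hypothetical helper names.

```python
import re

# Assumed field names for the PSS components; adjust to what the kernel reports.
PSS_FIELDS = ("Pss", "Pss_Anon", "Pss_File", "Pss_Shmem")

# Matches lines of the form "Name:   1234 kB".
_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB\s*$")


def parse_pss(text):
    """Return {field: kB} for the PSS fields found in smaps_rollup text."""
    result = {}
    for line in text.splitlines():
        m = _LINE.match(line)
        if m and m.group(1) in PSS_FIELDS:
            result[m.group(1)] = int(m.group(2))
    return result


def sum_group(texts):
    """Add up the PSS fields over the smaps_rollup contents of several processes."""
    total = dict.fromkeys(PSS_FIELDS, 0)
    for text in texts:
        for field, kb in parse_pss(text).items():
            total[field] += kb
    return total


def read_rollup(pid):
    """Read /proc/<pid>/smaps_rollup; return "" if the process is gone or unreadable."""
    try:
        with open(f"/proc/{pid}/smaps_rollup") as f:
            return f.read()
    except OSError:
        return ""
```

A caller would pick the pids belonging to one group (say, all renderer
processes) and pass [read_rollup(p) for p in pids] to sum_group; the
ratio of the anon and file totals is then the quantity of interest.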

On older kernels, which we have to support for several years, we've
added an equivalent "totmaps" locally and we'd be super-happy if going
forward we can just switch to smaps_rollup.

> > I do not expect this to add a visible penalty to users who are not going
> > to use the counter but have you tried to measure that?

Right, if smaps or smaps_rollup is not used, this cannot have a
measurable impact (maybe more code->more TLB misses, but that's at
most tiny), so no, I haven't tried to measure that.

I have been measuring the cost of reading smaps_rollup for all
processes on a chromebook under load (about 400 processes), but those
measurements are too noisy to show a change.
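
For reference, one way such a measurement could be set up is sketched
below.  It is an assumption about method, not the procedure actually
used for the numbers mentioned above: it times several full passes over
a list of files and reports the median and the spread, so that noise is
at least visible.  time_reads and summarize are hypothetical names.

```python
import statistics
import time


def time_reads(paths, repeats=5):
    """Return wall-clock seconds for each of `repeats` passes reading every path once."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for path in paths:
            try:
                with open(path, "rb") as f:
                    f.read()
            except OSError:
                pass  # process exited or file not readable; skip it
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (median, max - min) of the samples, in seconds."""
    return statistics.median(samples), max(samples) - min(samples)
```

The paths would be the /proc/<pid>/smaps_rollup files of the processes
being sampled.  If the spread is comparable to the difference between
two kernels, the comparison is inconclusive.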

The code is shared between smaps and smaps_rollup, and some of the
results aren't used in smaps, only in smaps_rollup, so there's some
waste (a couple of extra conditional branches, and loads/stores), but
I don't think that reducing it is worth the added code complexity.

> Also forgot to mention that any change to smaps should be documented in
> Documentation/filesystems/proc.txt.

Thank you, I'll fix that and send a v3 (and Cc linux-api).


> --
> Michal Hocko
> SUSE Labs


