On Fri 25-08-17 14:16:37, Andrew Morton wrote:
> On Thu, 24 Aug 2017 10:55:53 +0200 Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> > > If we assume that the number of VMAs is going to increase over time,
> > > then doing anything we can do to reduce the overhead of each VMA
> > > during PSS collection seems like the right way to go, and that means
> > > outputting an aggregate statistic (to avoid whatever overhead there
> > > is per line in writing smaps and in reading each line from
> > > userspace).
> > >
> > > Also, Dan sent me some numbers from his benchmark measuring PSS on
> > > system_server (the big Android process) using smaps vs smaps_rollup:
> > >
> > > using smaps:
> > > iterations:1000 pid:1163 pss:220023808
> > > 0m29.46s real 0m08.28s user 0m20.98s system
> > >
> > > using smaps_rollup:
> > > iterations:1000 pid:1163 pss:220702720
> > > 0m04.39s real 0m00.03s user 0m04.31s system
> >
> > I would assume we would do all we can to reduce this kernel->user
> > overhead first before considering a new user-visible file. I haven't
> > seen any attempts except for the low-hanging fruit I have tried.
>
> It's hard to believe that we'll get anything like a 5x speedup via
> optimization of the existing code?

Maybe we will not get that much of a boost, but getting misleading
numbers really quickly is not something we should aim for. Just try to
think about what the cumulative numbers actually mean. How can you even
consider a cumulative PSS when you have no idea which mappings were
considered?
--
Michal Hocko
SUSE Labs
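
[For reference, a minimal sketch of what the quoted benchmark might look
like, assuming it simply re-reads the proc file each iteration and sums
the "Pss:" fields. The actual benchmark was not posted in this thread, so
the program below is hypothetical; only the output format follows the
numbers quoted above.]

/*
 * Hypothetical PSS benchmark: reads /proc/<pid>/smaps or
 * /proc/<pid>/smaps_rollup <iterations> times, summing the "Pss:"
 * fields on each pass, and prints a line in the format quoted in
 * the thread. Not the benchmark Dan actually ran.
 */
#include <stdio.h>
#include <stdlib.h>

static long long read_pss(const char *path)
{
	FILE *f = fopen(path, "r");
	char line[256];
	long long kb, total = 0;

	if (!f) {
		perror(path);
		exit(1);
	}
	while (fgets(line, sizeof(line), f)) {
		/*
		 * Each VMA in smaps (and the single aggregate entry in
		 * smaps_rollup) reports "Pss: <n> kB".
		 */
		if (sscanf(line, "Pss: %lld kB", &kb) == 1)
			total += kb;
	}
	fclose(f);
	return total * 1024;	/* bytes, matching the quoted output */
}

int main(int argc, char **argv)
{
	char path[64];
	long long pss = 0;
	int i, iterations, pid;

	if (argc != 4) {
		fprintf(stderr,
			"usage: %s <iterations> <pid> <smaps|smaps_rollup>\n",
			argv[0]);
		return 1;
	}
	iterations = atoi(argv[1]);
	pid = atoi(argv[2]);
	snprintf(path, sizeof(path), "/proc/%d/%s", pid, argv[3]);

	for (i = 0; i < iterations; i++)
		pss = read_pss(path);

	printf("iterations:%d pid:%d pss:%lld\n", iterations, pid, pss);
	return 0;
}

[The 5x wall-clock difference in the quoted numbers would then
presumably come from the point made above: smaps_rollup emits one
aggregate entry instead of formatting every VMA, so both the kernel-side
seq_file output and the userspace reads shrink accordingly.]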