On Mon 15-08-16 12:25:10, Robert Foss wrote:
>
>
> On 2016-08-15 09:42 AM, Michal Hocko wrote:
[...]
> > The use case is to speed up monitoring of
> > memory consumption in environments where RSS isn't precise.
> >
> > For example Chrome tends to have many processes which have hundreds of VMAs
> > with a substantial amount of shared memory, and the error of using
> > RSS rather than PSS tends to be very large when looking at overall
> > memory consumption. PSS isn't kept as a single number that's exported
> > like RSS, so to calculate PSS means having to parse a very large smaps
> > file.
> >
> > This process is slow and has to be repeated for many processes, and we
> > found that just the act of doing the parsing was taking up a
> > significant amount of CPU time, so this patch is an attempt to make
> > that process cheaper.

Well, this is slow because it requires the pte walk; otherwise you cannot
know how many ptes map a particular shared page. Your patch
(totmaps_proc_show) does the very same page table walk because in fact
it is unavoidable. So what exactly is the difference, except for the
userspace parsing, which is quite trivial? E.g. my currently running
Firefox has

$ awk '/^[0-9a-f]/{print}' /proc/4950/smaps | wc -l
984

quite some VMAs, yet parsing it spends basically all the time in the
kernel...

$ /usr/bin/time -v awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/4950/smaps
rss:1112288 pss:1096435
        Command being timed: "awk /^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss} /proc/4950/smaps"
        User time (seconds): 0.00
        System time (seconds): 0.02
        Percent of CPU this job got: 91%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.02

So I am not really sure I see the performance benefit.
--
Michal Hocko
SUSE Labs
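To illustrate why the pte walk cannot be skipped, here is a toy model of the accounting, not the kernel code. It assumes 4 kB pages and takes one mapcount per resident page on stdin; both the input format and the helper name toy_pss are made up for this sketch. Each page adds its full size to RSS but only size/mapcount to PSS, so the per-page mapcount has to be known before PSS can be summed.

```shell
#!/bin/sh
# Toy model only: toy_pss is a hypothetical helper, not an existing tool.
# stdin: one line per resident page, holding that page's mapcount
#        (how many ptes map it). Assumes 4 kB pages.
toy_pss() {
    awk -v page_kb=4 '
        $1 >= 1 {
            rss += page_kb          # RSS charges the full page to the process
            pss += page_kb / $1     # PSS charges only a 1/mapcount share
        }
        END { printf "rss:%d pss:%d\n", rss, pss }'
}

# Two private pages (mapcount 1) and two pages shared by four processes.
printf '1\n1\n4\n4\n' | toy_pss
# prints: rss:16 pss:10
```

The gap between the two numbers comes entirely from the shared pages, which is the case described above for processes with a lot of shared memory.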