On Wed, Aug 09, 2017 at 05:15:57PM -0700, Daniel Colascione wrote:
> /proc/pid/smaps_rollup is a new proc file that improves the
> performance of user programs that determine aggregate memory
> statistics (e.g., total PSS) of a process.
>
> Android regularly "samples" the memory usage of various processes in
> order to balance its memory pool sizes. This sampling process involves
> opening /proc/pid/smaps and summing certain fields. For very large
> processes, sampling memory use this way can take several hundred
> milliseconds, due mostly to the overhead of the seq_printf calls in
> task_mmu.c.
>
> smaps_rollup improves the situation. It contains most of the fields of
> /proc/pid/smaps, but instead of a set of fields for each VMA,
> smaps_rollup instead contains one synthetic smaps-format entry
> representing the whole process. In the single smaps_rollup synthetic
> entry, each field is the summation of the corresponding field in all
> of the real-smaps VMAs. Using a common format for smaps_rollup and
> smaps allows userspace parsers to repurpose parsers meant for use with
> non-rollup smaps for smaps_rollup, and it allows userspace to switch
> between smaps_rollup and smaps at runtime (say, based on the
> availability of smaps_rollup in a given kernel) with minimal fuss.
>
> By using smaps_rollup instead of smaps, a caller can avoid the
> significant overhead of formatting, reading, and parsing each of a
> large process's potentially very numerous memory mappings. For
> sampling system_server's PSS in Android, we measured a 12x speedup,
> representing a savings of several hundred milliseconds.
>
> One alternative to a new per-process proc file would have been
> including PSS information in /proc/pid/status. We considered this
> option but thought that PSS would be too expensive (by a few orders of
> magnitude) to collect relative to what's already emitted as part of
> /proc/pid/status, and slowing every user of /proc/pid/status for the
> sake of readers that happen to want PSS feels wrong.
>
> The code itself works by reusing the existing VMA-walking framework we
> use for regular smaps generation and keeping the mem_size_stats
> structure around between VMA walks instead of using a fresh one for
> each VMA. In this way, summation happens automatically. We let
> seq_file walk over the VMAs just as it does for regular smaps and just
> emit nothing to the seq_file until we hit the last VMA.
>
> Patch changelog:
>
> v2: Fix typo in commit message
>     Add ABI documentation as requested by gregkh
>
> Signed-off-by: Daniel Colascione <dancol@xxxxxxxxxx>

I love this.

FYI, there was an earlier attempt at this, but it failed at the time:

https://marc.info/?l=linux-kernel&m=147310650003277&w=2
http://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1229163.html

So this time, I really hope we merge this patch.
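
To illustrate the point in the description about reusing one parser for both files, here is a minimal userspace sketch. It is not code from the patch: the function names sum_smaps_fields and process_pss_kb are hypothetical, and it assumes only what the description states, namely that smaps_rollup uses the same "Name: N kB" field lines as smaps and may be absent on older kernels.

```python
from collections import Counter
from pathlib import Path


def sum_smaps_fields(text):
    """Sum every "Name: <n> kB" field across all entries in smaps-format text."""
    totals = Counter()
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Only size fields look like "Pss:  12 kB". VMA header lines and
        # VmFlags lines do not match this shape and are skipped.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            totals[key.strip()] += int(parts[0])
    return totals


def process_pss_kb(pid):
    """Return total PSS in kB, preferring smaps_rollup when it exists.

    Hypothetical helper: the fallback mirrors the runtime switch between
    smaps_rollup and smaps that the patch description mentions.
    """
    proc = Path("/proc") / str(pid)
    try:
        text = (proc / "smaps_rollup").read_text()
    except FileNotFoundError:
        # Kernel without smaps_rollup: fall back to the per-VMA smaps file.
        text = (proc / "smaps").read_text()
    return sum_smaps_fields(text)["Pss"]
```

Because the rollup file holds a single synthetic entry, the same summing loop simply sees one entry instead of many, so a caller such as process_pss_kb(os.getpid()) needs no format-specific branch beyond picking which file to open.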