On 10/01/2024 23:21, John Hubbard wrote:
> On 1/10/24 09:32, Ryan Roberts wrote:
> ...
>> options:
>>   -h, --help     show this help message and exit
>>   --pid pid      Process id of the target process. Exactly one
>>                  of --pid and --cgroup must be provided.
>>   --cgroup path  Path to the target cgroup in sysfs. Iterates
>>                  over every pid in the cgroup and its children.
>>                  Get global stats by passing in the root cgroup
>
> Hi Ryan,
>
> Yes, this version is fairly effective at getting global stats now.
>
> I've got some proposed minor tweaks below, and a few questions. Let me
> start with the questions:
>
> 1) When I run this on an older 6.4.8-based kernel:
>
> # ./thpmaps --cgroup /sys/fs/cgroup --cont 128K --cont 512K --cont 1M \
>       --cont 2M --cont 512M --summary
>
> , I get this output:
>
> file-thp-aligned-524288kB:   36175872 kB (95%)
> file-thp-partial:              856640 kB ( 2%)
> file-cont-aligned-128kB:     37032320 kB (97%)
> file-cont-aligned-512kB:     36597760 kB (96%)
> file-cont-aligned-1024kB:    36597760 kB (96%)
> file-cont-aligned-2048kB:    36595712 kB (96%)
> file-cont-aligned-524288kB:  36175872 kB (95%)
>
>
> Is it true that the above is basically "normal" 512MB THP in action?

No: the "file" part of the counter name means the memory is file-backed (not
anon). So this is not mTHP, which would always be anon (e.g.
"anon-cont-aligned-128kB"). Based on your follow-up mail, I would guess this is
mostly hugetlb memory rather than actual page cache memory, but they are both
getting lumped into those "file" labels.

> And all of the "cont" entries are just that way because we can't
> really tell mTHP/cont apart from normal THP?

I'm not sure exactly what you are asking. The "cont" counters count blocks of
contiguous, naturally aligned physical memory, which are also mapped
contiguously and aligned. So a smaller --cont will always include all the
memory captured by a larger --cont. In this case, it's all *file-backed* memory
(as highlighted in the label name), so it has nothing to do with (m)THP.
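
As a rough illustration of the "--cont" definition above (this is not the
thpmaps code; the function name, the vfn-to-pfn dict and the example frame
numbers are all made up for the sketch), a block of N pages is counted only if
it starts on an N-page boundary both virtually and physically, and every page
in it maps to the next consecutive physical frame:

```python
def cont_aligned_pages(vfn_to_pfn, nr_pages):
    """Count pages that sit in blocks of nr_pages which are naturally
    aligned and contiguous in both virtual and physical frame numbers.

    vfn_to_pfn is a hypothetical dict: virtual frame number -> physical
    frame number, one entry per mapped page.
    """
    counted = 0
    for vfn, pfn in vfn_to_pfn.items():
        # A block can only start on an nr_pages boundary in both spaces.
        if vfn % nr_pages or pfn % nr_pages:
            continue
        # Every page of the block must be mapped to consecutive frames.
        if all(vfn_to_pfn.get(vfn + i) == pfn + i for i in range(nr_pages)):
            counted += nr_pages
    return counted


# Hypothetical example with 4K pages (32 pages = 128K, 512 pages = 2M).
mapping = {}
# Region A: 512 pages, aligned to 512 pages virtually and physically.
mapping.update({4096 + i: 8192 + i for i in range(512)})
# Region B: 32 pages, physically aligned to 32 pages but not to 512.
mapping.update({5120 + i: 10272 + i for i in range(32)})

pages_128k = cont_aligned_pages(mapping, 32)   # regions A and B qualify
pages_2m = cont_aligned_pages(mapping, 512)    # only region A qualifies
```

With this mapping, the 32-page count covers everything the 512-page count
does, plus region B, which matches the "smaller --cont includes larger --cont"
behaviour described above.
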
But where you have THP, --cont doesn't care what the underlying THP size is as
long as its requirements are met, so PMD-sized THPs would be included in e.g.
*anon*-cont-aligned-128kB. Note that the "--cont" counters don't directly count
memory that is PTE-mapped with the contiguous bit set in the page table; they
just count memory that meets the alignment, size and mapping requirements. On
arm64 systems with the contpte series, the contiguous bit would be used here,
but it's not part of what's getting measured.

>
> 2) On an mTHP kernel with the latest patchsets (arm64, 64K page size), I
> *think* I cannot turn off mTHP. I'm still teasing apart how much of this
> is an instrumentation error, and how much is a measurement problem (with
> the test suite). And maybe I'm wrong entirely. But the "never" option
> doesn't seem to have an effect. Unless the latest version of the testsuite
> is doing something new, sigh.
>
> $ for f in $(find /sys/kernel/mm/transparent_hugepage/ -name enabled); do echo "$f: $(cat $f)"; done
> /sys/kernel/mm/transparent_hugepage/hugepages-512kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/enabled: always madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-262144kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-32768kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-1024kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-16384kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-524288kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-8192kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-256kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-65536kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-131072kB/enabled: always inherit madvise [never]
> /sys/kernel/mm/transparent_hugepage/hugepages-4096kB/enabled: always inherit madvise [never]
>
> Any quick thoughts? Don't waste any time on this, it's probably
> operator error. Just in case, though.

As per your email, you're looking at hugetlb memory (as per counter label). I
have all the information to create a hugetlb-specific set of counters, so it's
not lumped in with page cache memory. You would then have counter sets of
"anon", "file" and "htlb". Would that be useful?

>
>
>> (e.g. /sys/fs/cgroup for cgroup-v2 or
>> /sys/fs/cgroup/pids for cgroup-v1). Exactly one
>> of --pid and --cgroup must be provided.
>
> Maybe we could add "--global" to that list. That would look, in order,
> inside cgroups2 and cgroups, for a list of pids, and then run as if
> --cgroup /sys/fs/cgroup or --cgroup /sys/fs/cgroup/pids were specified.

I think it might actually be better just to make global the default when
neither --pid nor --cgroup is provided? In that case, I'll just grab all the
pids from /proc rather than traverse the cgroup hierarchy; that way it will
work on systems without cgroups. Does that work for you?

>
> It's nicer than failing out. And it's also directly useful. I would be
> running my above command like this, instead:
>
> # ./thpmaps --global --cont 128K --cont 512K --cont 1M \
>       --cont 2M --cont 512M --summary
>
> thanks,
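
For reference, the "global by default" idea could look roughly like the sketch
below. It is only an illustration under assumptions, not the actual thpmaps
change: the helper names (all_pids, parse_args, is_global) are hypothetical,
and only the --pid and --cgroup options are taken from the help text quoted
above.

```python
import argparse
import os


def all_pids(proc_root="/proc"):
    """Return every pid that has a numeric entry under proc_root."""
    return sorted(int(name) for name in os.listdir(proc_root) if name.isdigit())


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # At most one of --pid/--cgroup may be given; giving neither is allowed
    # and is treated as "global" by is_global() below.
    group = parser.add_mutually_exclusive_group(required=False)
    group.add_argument("--pid", type=int, metavar="pid")
    group.add_argument("--cgroup", metavar="path")
    return parser.parse_args(argv)


def is_global(args):
    """True when neither --pid nor --cgroup was passed."""
    return args.pid is None and args.cgroup is None
```

When is_global() is true, the tool would iterate over all_pids() instead of
walking a cgroup, so it does not depend on cgroups being mounted.
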