Hi

On 24.10.2022 07:28, Shakeel Butt wrote:
> Currently mm_struct maintains rss_stats which are updated on page fault
> and the unmapping codepaths. For page fault codepath the updates are
> cached per thread with the batch of TASK_RSS_EVENTS_THRESH which is 64.
> The reason for caching is performance for multithreaded applications
> otherwise the rss_stats updates may become hotspot for such
> applications.
>
> However this optimization comes with the cost of error margin in the rss
> stats. The rss_stats for applications with large number of threads can
> be very skewed. At worst the error margin is (nr_threads * 64) and we
> have a lot of applications with 100s of threads, so the error margin can
> be very high. Internally we had to reduce TASK_RSS_EVENTS_THRESH to 32.
>
> Recently we started seeing the unbounded errors for rss_stats for
> specific applications which use TCP rx0cp. It seems like
> vm_insert_pages() codepath does not sync rss_stats at all.
>
> This patch converts the rss_stats into percpu_counter to convert the
> error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2).
> However this conversion enable us to get the accurate stats for
> situations where accuracy is more important than the cpu cost. Though
> this patch does not make such tradeoffs.
>
> Signed-off-by: Shakeel Butt <shakeelb@xxxxxxxxxx>

This patch landed recently in linux-next as commit d59f19a7a068 ("mm:
convert mm's rss stats into percpu_counter"). Unfortunately it causes a
regression on my test systems. I've noticed that it triggers a 'BUG: Bad
rss-counter state' warning from time to time for random processes. This
is somehow related to CPU hot-plug and/or system suspend/resume.
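
As a side note on the numbers in the quoted description, the two error
margins can be compared with a few lines of C. This is only a sketch of
the arithmetic as stated there, not kernel code: the function names are
made up for illustration, and "(nr_cpus ^ 2)" is read as "nr_cpus
squared" rather than as XOR.

```c
#include <assert.h>

/* Batch size quoted in the patch description. */
#define TASK_RSS_EVENTS_THRESH 64

/*
 * Worst-case error of the per-thread caching scheme as stated in the
 * description: each thread may hold one unsynced batch.
 * (Hypothetical helper, not a kernel function.)
 */
long batched_worst_case_error(long nr_threads, long batch)
{
	assert(nr_threads >= 0 && batch >= 0);
	return nr_threads * batch;
}

/*
 * Approximate error of the percpu_counter scheme as stated in the
 * description: about nr_cpus squared.
 * (Hypothetical helper, not a kernel function.)
 */
long percpu_approx_error(long nr_cpus)
{
	assert(nr_cpus >= 0);
	return nr_cpus * nr_cpus;
}
```

For example, with 500 threads the old bound is 500 * 64 = 32000 pages
(16000 with the threshold lowered to 32), while on an assumed 64-CPU
machine the new approximate bound is 64 * 64 = 4096 pages, independent
of the thread count.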

The easiest way to reproduce this issue (although not always) on my test
systems (ARM or ARM64 based) is to run the following commands:

root@target:~# for i in /sys/devices/system/cpu/cpu[1-9]; do echo 0 >$i/online;
BUG: Bad rss-counter state mm:f04c7160 type:MM_FILEPAGES val:1
BUG: Bad rss-counter state mm:50f1f502 type:MM_FILEPAGES val:2
BUG: Bad rss-counter state mm:50f1f502 type:MM_ANONPAGES val:15
BUG: Bad rss-counter state mm:63660fd0 type:MM_FILEPAGES val:2
BUG: Bad rss-counter state mm:63660fd0 type:MM_ANONPAGES val:15

Let me know if I can help debugging this somehow or testing a fix.

> ---
>  include/linux/mm.h             | 26 ++++--------
>  include/linux/mm_types.h       |  7 +---
>  include/linux/mm_types_task.h  | 13 ------
>  include/linux/percpu_counter.h |  1 -
>  include/linux/sched.h          |  3 --
>  include/trace/events/kmem.h    |  8 ++--
>  kernel/fork.c                  | 16 +++++++-
>  mm/memory.c                    | 73 +++++-----------------------------
>  8 files changed, 40 insertions(+), 107 deletions(-)
>
> ...

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland