Hi Harshvardhan,

[ Cc cgroups@xxxxxxxxxxxxxxx: FYI problem in recent kernel using cgroup v1 ]

> Kind regards,
> Petr
> >
> > Hi there,
> >
> > I saw your name appear the most in the commit log of memcg_stat_rss.sh so I was wondering if you had any information as to why this is happening. I feel that we have enough reason to believe that this is due to outdated testcases. It’ll be highly appreciated if you could verify this fact.
> >
> > Thanks & Regards,
> > Harshvardhan
> >
> > From: ltp <ltp-bounces+harshvardhan.j.jha=oracle.com@xxxxxxxxxxxxxx> on behalf of Harshvardhan Jha via ltp <ltp@xxxxxxxxxxxxxx>
> > Date: Thursday, 28 November 2024 at 3:20 PM
> > To: ltp@xxxxxxxxxxxxxx <ltp@xxxxxxxxxxxxxx>
> > Subject: [LTP] Issue faced in memcg_stat_rss while running mainline kernels between 6.7 and 6.8
> >
> > Hi there,
> >
> > I've been getting test failures on the memcg_stat_rss testcase for
> > mainline 6.12 kernels with 3 tests failing and one being broken.
> >
> > Running tests.......
> > <<<test_start>>>
> > tag=memcg_stat_rss stime=1732003500
> > cmdline="memcg_stat_rss.sh"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > incrementing stop
> > memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
> > memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
> > 6.12.0-master.20241021.el9.v1.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Oct 21
> > 06:24:22 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
> > memcg_stat_rss 1 TINFO: Using
> > /tempdir/ltp-Y4AEUmKVIE/LTP_memcg_stat_rss.kEhD0QvvMw as tmpdir (xfs
> > filesystem)
> > memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
> > memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
> > to 0 failed
> > memcg_stat_rss 1 TINFO: Setting shmmax
> > memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 1 TINFO: Warming up pid: 9367
> > memcg_stat_rss 1 TINFO: Process is still here after warm up: 9367
> > memcg_stat_rss 1 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 2 TINFO: Running memcg_process --mmap-file -s 4096
> > memcg_stat_rss 2 TINFO: Warming up pid: 9383
> > memcg_stat_rss 2 TINFO: Process is still here after warm up: 9383
> > memcg_stat_rss 2 TPASS: rss is 0 as expected
> > memcg_stat_rss 3 TINFO: Running memcg_process --shm -k 3 -s 4096
> > memcg_stat_rss 3 TINFO: Warming up pid: 9446
> > memcg_stat_rss 3 TINFO: Process is still here after warm up: 9446
> > memcg_stat_rss 3 TPASS: rss is 0 as expected
> > memcg_stat_rss 4 TINFO: Running memcg_process --mmap-anon --mmap-file
> > --shm -s 266240
> > memcg_stat_rss 4 TINFO: Warming up pid: 9462
> > memcg_stat_rss 4 TINFO: Process is still here after warm up: 9462
> > memcg_stat_rss 4 TPASS: rss is 266240 as expected
> > memcg_stat_rss 5 TINFO: Running memcg_process --mmap-lock1 -s 266240
> > memcg_stat_rss 5 TINFO: Warming up pid: 9479
> > memcg_stat_rss 5 TINFO: Process is still here after warm up: 9479
> > memcg_stat_rss 5 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 6 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 6 TINFO: Warming up pid: 9495
> > memcg_stat_rss 6 TINFO: Process is still here after warm up: 9495
> > memcg_stat_rss 6 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 6 TBROK: timed out on memory.usage_in_bytes 4096 266240
> > 266240
> > /opt/ltp-20240930/testcases/bin/tst_test.sh: line 158: 9495
> > Killed memcg_process "$@" (wd:
> > /sys/fs/cgroup/memory/ltp/test-9308/ltp_9308)
> > Summary:
> > passed 3
> > failed 3
> > broken 1
> > skipped 0
> > warnings 0
> > <<<execution_status>>>
> > initiation_status="ok"
> > duration=17 termination_type=exited termination_id=3 corefile=no
> > cutime=13 cstime=58
> > <<<test_end>>>
> > INFO: ltp-pan reported some tests FAIL
> > LTP Version: 20240930
> >
> > I'm not sure whether this error is due to the kernel or the testcase
> > being outdated. I know that since cgroup v2 is the default upstream and
> > cgroup v1 is now a legacy option, this specific testcase is not

Yes, exactly.
I have a system with cgroup v1, but it's based on 4.12.14. Even an old Debian VM
with the old 5.10 kernel uses cgroup v2. Therefore I have no chance to debug the
problem.

> > particularly higher in the priority list, but just to be sure, I wanted
> > to verify this from your side. Please let me know whether this error is
> > coming due to the testcase being outdated or this in fact is a valid
> > kernel error.
> >
> > I ran a bisect on memcg_stat_rss test upon mainline kernels and saw the
> > bisect range narrow down between 6.7 and 6.8 which further isolated to:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=7d7ef0a4686abe43cd76a141b340a348f45ecdf2

This was a reason to Cc cgroups@xxxxxxxxxxxxxxx.

> > This commit was part of a 5 patch series and I wasn't able to revert it
> > on 6.12 without getting a series of conflicts.
> >
> > So, what I did was checkout the SHA before this patch series
> > 4a3bfbd1699e2306731809d50d480634012ed4de and after the patch series
> > 7d7ef0a4686abe43cd76a141b340a348f45ecdf2 and ran this test.
> >
> > The machine had 32GB Ram and 4CPUs.
> >
> > The steps to reproduce this are:
> >
> > #!/bin/bash
> > # After setting default kernel to the desired one
> > if ! grep -q "unified_cgroup_hierarchy=0" /proc/cmdline; then
> >     sudo grubby --update-kernel DEFAULT --args="systemd.unified_cgroup_hierarchy=0"
> >     sudo grubby --update-kernel DEFAULT --args="systemd.legacy_systemd_cgroup_controller"
> >     sudo grubby --update-kernel DEFAULT --args selinux=0
> >     sudo sed -i "/^SELINUX=/s/=.*/=disabled/" /etc/selinux/config
> >     sudo reboot
> > fi
> > cd /opt/ltp
> > rm -rf /tempdir
> > mkdir /tempdir
> > ./runltp -d /tempdir -s memcg_stat_rss

Or just:

# PATH="/opt/ltp/testcases/bin:$PATH" memcg_stat_rss.sh

Kind regards,
Petr

> > The results obtained were:
> >
> > Pre bisect culprit (4a3bfbd1699e2306731809d50d480634012ed4de):
> > <<<test_start>>>
> > tag=memcg_stat_rss stime=1731754078
> > cmdline="memcg_stat_rss.sh"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > incrementing stop
> > memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
> > memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
> > 6.7.0-masterpre.2024111.el9.rc1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov 15
> > 11:56:10 PST 2024 x86_64 x86_64 x86_64 GNU/Linux
> > memcg_stat_rss 1 TINFO: Using
> > /tempdir/ltp-SzE9ADK6MM/LTP_memcg_stat_rss.6op28sMXO2 as tmpdir (xfs
> > filesystem)
> > memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
> > memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
> > to 0 failed
> > memcg_stat_rss 1 TINFO: Setting shmmax
> > memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 1 TINFO: Warming up pid: 34237
> > memcg_stat_rss 1 TINFO: Process is still here after warm up: 34237
> > memcg_stat_rss 1 TPASS: rss is 266240 as expected
> > memcg_stat_rss 1 TBROK: timed out on memory.usage_in_bytes 4096 266240
> > 266240
> > /opt/ltp-20240930/testcases/bin/tst_test.sh: line 158: 34237
> > Killed memcg_process "$@" (wd:
> > /sys/fs/cgroup/memory/ltp/test-34180/ltp_34180)
> > Summary:
> > passed 1
> > failed 0
> > broken 1
> > skipped 0
> > warnings 0
> > <<<execution_status>>>
> >
> > Post bisect culprit (7d7ef0a4686abe43cd76a141b340a348f45ecdf2):
> > <<<test_start>>>
> > tag=memcg_stat_rss stime=1731755339
> > cmdline="memcg_stat_rss.sh"
> > contacts=""
> > analysis=exit
> > <<<test_output>>>
> > incrementing stop
> > memcg_stat_rss 1 TINFO: Running: memcg_stat_rss.sh
> > memcg_stat_rss 1 TINFO: Tested kernel: Linux harjha-ol9kdevltp
> > 6.7.0-masterpost.2024111.el9.rc1.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Nov
> > 15 11:55:41 PST 2024 x86_64 x86_64 x86_64 GNU/Linux
> > memcg_stat_rss 1 TINFO: Using
> > /tempdir/ltp-G6cge4CkrR/LTP_memcg_stat_rss.1zrm6X02CO as tmpdir (xfs
> > filesystem)
> > memcg_stat_rss 1 TINFO: timeout per run is 0h 5m 0s
> > memcg_stat_rss 1 TINFO: set /sys/fs/cgroup/memory/memory.use_hierarchy
> > to 0 failed
> > memcg_stat_rss 1 TINFO: Setting shmmax
> > memcg_stat_rss 1 TINFO: Running memcg_process --mmap-anon -s 266240
> > memcg_stat_rss 1 TINFO: Warming up pid: 9083
> > memcg_stat_rss 1 TINFO: Process is still here after warm up: 9083
> > memcg_stat_rss 1 TFAIL: rss is 0, 266240 expected
> > memcg_stat_rss 1 TBROK: timed out on memory.usage_in_bytes 4096 266240
> > 266240
> > /opt/ltp-20240930/testcases/bin/tst_test.sh: line 158: 9083
> > Killed memcg_process "$@" (wd:
> > /sys/fs/cgroup/memory/ltp/test-9024/ltp_9024)
> > Summary:
> > passed 0
> > failed 1
> > broken 1
> > skipped 0
> > warnings 0
> > <<<execution_status>>>
> >
> > Thanks & Regards,
> > Harshvardhan
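
For anyone who wants to check the counter by hand while the test process is
still alive, a minimal sketch follows. It assumes that the value the test
reports as "rss" is the `rss` line of the cgroup v1 `memory.stat` file; the
helper name `memcg_stat_field` is hypothetical and not part of LTP.

```shell
#!/bin/sh
# Hypothetical helper (not part of LTP): print one counter from a
# memory.stat-style file, where each line has the form "<name> <value>".
# Usage: memcg_stat_field FILE FIELD
# Exits non-zero if FIELD is not present in FILE.
memcg_stat_field() {
    awk -v key="$2" '
        $1 == key { print $2; found = 1; exit }
        END       { if (!found) exit 1 }
    ' "$1"
}
```

For example, `memcg_stat_field "$cg/memory.stat" rss`, where `$cg` is a
placeholder for the test's cgroup directory (the logs above show it being
created under /sys/fs/cgroup/memory/ltp/). Comparing that value on the two
bisected kernels would show whether the counter itself reads 0 or the test is
parsing it incorrectly.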