Re: [PATCH 3/3] kfence: test: try to avoid test_gfpzero trigger rcu_stall

Shuah Khan <skhan@xxxxxxxxxxxxxxxxxxx> · Wed, 9 Mar 2022 14:39:28 -0700

On 3/8/22 6:47 PM, Peng Liu wrote:
When CONFIG_KFENCE_DYNAMIC_OBJECTS is set to a large number, the kfence
KUnit test case test_gfpzero consumes nearly all of a CPU's resources
and an RCU stall is reported. The following log was captured on a
physical server.

   rcu: INFO: rcu_sched self-detected stall on CPU
   rcu: 	68-....: (14422 ticks this GP) idle=6ce/1/0x4000000000000002
   softirq=592/592 fqs=7500 (t=15004 jiffies g=10677 q=20019)
   Task dump for CPU 68:
   task:kunit_try_catch state:R  running task
   stack:    0 pid: 9728 ppid:     2 flags:0x0000020a
   Call trace:
    dump_backtrace+0x0/0x1e4
    show_stack+0x20/0x2c
    sched_show_task+0x148/0x170
    ...
    rcu_sched_clock_irq+0x70/0x180
    update_process_times+0x68/0xb0
    tick_sched_handle+0x38/0x74
    ...
    gic_handle_irq+0x78/0x2c0
    el1_irq+0xb8/0x140
    kfree+0xd8/0x53c
    test_alloc+0x264/0x310 [kfence_test]
    test_gfpzero+0xf4/0x840 [kfence_test]
    kunit_try_run_case+0x48/0x20c
    kunit_generic_run_threadfn_adapter+0x28/0x34
    kthread+0x108/0x13c
    ret_from_fork+0x10/0x18

To avoid rcu_stall and unacceptable latency, a schedule point is
added to test_gfpzero.

Signed-off-by: Peng Liu <liupeng256@xxxxxxxxxx>
---
  mm/kfence/kfence_test.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index caed6b4eba94..1b50f70a4c0f 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -627,6 +627,7 @@ static void test_gfpzero(struct kunit *test)
  			kunit_warn(test, "giving up ... cannot get same object back\n");
  			return;
  		}
+		cond_resched();

This sounds like a band-aid - is there a better way to fix this?

  	}
  
  	for (i = 0; i < size; i++)


thanks,
-- Shuah
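
For readers unfamiliar with the pattern under discussion: a schedule point is a place in a long-running loop where the task offers to give up the CPU. The sketch below is only a userspace analogue, assuming sched_yield() as a stand-in for the kernel's cond_resched(). The names schedule_point, yield_count and alloc_retry_loop are hypothetical and are not taken from kfence_test.c.

```c
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts how many times the loop offered to give up the CPU. */
static unsigned long yield_count;

/*
 * Hypothetical userspace stand-in for cond_resched(): always yields,
 * whereas the kernel helper only reschedules when it is needed.
 */
static void schedule_point(void)
{
	yield_count++;
	sched_yield();
}

/*
 * Hypothetical analogue of a retry loop that allocates and frees an
 * object up to max_tries times. Without the schedule point, a large
 * max_tries keeps the CPU busy for the whole loop.
 * Returns the number of completed attempts.
 */
static size_t alloc_retry_loop(size_t size, size_t max_tries)
{
	size_t tries;

	for (tries = 0; tries < max_tries; tries++) {
		void *buf = malloc(size);

		if (!buf)
			break;
		free(buf);
		schedule_point();	/* let other tasks run between attempts */
	}
	return tries;
}
```

Whether a per-iteration schedule point is the right fix, or whether the loop bound itself should change, is the question raised in the review above; the sketch does not settle it.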