Re: [PATCH 1/1] mm/page_alloc: add scheduling point to free_unref_page_list

wangjianxing <wangjianxing@xxxxxxxxxxx> · Thu, 3 Mar 2022 10:02:45 +0800

On 03/03/2022 07:34 AM, Andrew Morton wrote:
On Tue,  1 Mar 2022 20:38:25 -0500 wangjianxing <wangjianxing@xxxxxxxxxxx> wrote:

Freeing a large list of pages may cause rcu_sched to be starved on
non-preemptible kernels:

rcu: rcu_sched kthread starved for 5359 jiffies! g454793 f0x0
RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=19
[...]
Call Trace:
   free_unref_page_list+0x19c/0x270
   release_pages+0x3cc/0x498
   tlb_flush_mmu_free+0x44/0x70
   zap_pte_range+0x450/0x738
   unmap_page_range+0x108/0x240
   unmap_vmas+0x74/0xf0
   unmap_region+0xb0/0x120
   do_munmap+0x264/0x438
   vm_munmap+0x58/0xa0
   sys_munmap+0x10/0x20
   syscall_common+0x24/0x38
Thanks.

How did this large list of pages come about?

Will people be seeing this message in upstream kernels, or is it
specific to some caller code which you have added?

Please always include details such as this so that others can determine
whether the fix should be backported into -stable kernels.
Thanks.

I increased the CPU overcommit ratio to between 1:2 and 1:3 on a KVM
hypervisor: each VM has the same number of vCPUs as the host has CPUs,
and I set up 2 or 3 VMs.
I ran the ltpstress test in each VM. Both host and guest run
non-preemptible kernels, and dmesg in the VMs shows some rcu_sched
warnings.

The LTP version is 20180926, but I have not yet analyzed the ltpstress
code in depth.
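
To illustrate the idea of a scheduling point in a long free loop, here
is a minimal userspace sketch. It is not the kernel change itself. The
names release_list, struct node, and YIELD_BATCH are hypothetical, the
batch size of 32 is an arbitrary choice for illustration, and
sched_yield() only stands in for a kernel scheduling point such as
cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical batch size, chosen only for illustration. */
#define YIELD_BATCH 32

/* Hypothetical stand-in for an entry on a list of pages to free. */
struct node {
	struct node *next;
};

/*
 * Walk the list and "release" every entry, giving up the CPU after
 * each YIELD_BATCH entries so that a very long list cannot monopolize
 * the CPU. Stores the number of released entries in *released and
 * returns how many times the loop yielded.
 */
static size_t release_list(struct node *head, size_t *released)
{
	size_t count = 0;
	size_t yields = 0;

	assert(released != NULL);

	while (head) {
		struct node *next = head->next;

		head->next = NULL;	/* stand-in for the real release work */
		head = next;

		if (++count % YIELD_BATCH == 0) {
			sched_yield();	/* userspace analogue of a scheduling point */
			yields++;
		}
	}

	*released = count;
	return yields;
}
```

With a list of 100 entries this yields after entries 32, 64, and 96.
Yielding once per batch rather than once per entry keeps the overhead
of the extra scheduling checks small.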