Hi all,

I have built kernel 4.4.38-rt49 with CONFIG_PREEMPT_RT_FULL=y, and the
kernel crashes when I run the UnixBench spawn test case.
Here is the oops info:
[ 206.143829] BUG: scheduling while atomic: spawn/27356/0x00000002
[ 206.143839] Modules linked in: bcmdhd pci_tegra bluedroid_pm ip_tables
[ 206.143846] CPU: 5 PID: 27356 Comm: spawn Tainted: G W
4.4.38-DATA-RT-g06219d69-dirty #7
[ 206.143848] Hardware name: quill (DT)
[ 206.143850] Call trace:
[ 206.143871] [<ffffffc0000898f0>] dump_backtrace+0x0/0x100
[ 206.143875] [<ffffffc000089ab8>] show_stack+0x14/0x1c
[ 206.143884] [<ffffffc000314120>] dump_stack+0x98/0xc0
[ 206.143902] [<ffffffc00016c330>] __schedule_bug+0x44/0x5c
[ 206.143911] [<ffffffc000afd690>] __schedule+0x418/0x4f4
[ 206.143913] [<ffffffc000afd7b8>] schedule+0x4c/0xe4
[ 206.143918] [<ffffffc000afeeb8>] rt_spin_lock_slowlock+0x194/0x2c4
[ 206.143921] [<ffffffc000b0048c>] rt_spin_lock+0x58/0x5c
[ 206.143926] [<ffffffc0000e4678>] __wake_up+0x20/0x4c
[ 206.143930] [<ffffffc0000e6d58>] __percpu_up_read+0x34/0x3c
[ 206.143939] [<ffffffc0000a2de0>] copy_process.isra.52+0x136c/0x19f0
[ 206.143942] [<ffffffc0000a3590>] _do_fork+0x74/0x39c
[ 206.143945] [<ffffffc0000a3980>] SyS_clone+0x1c/0x24
[ 206.143949] [<ffffffc000084ff0>] el0_svc_naked+0x24/0x28
[ 206.143963] Unable to handle kernel paging request at virtual address
7ff31fc040
[ 206.143964] pgd = ffffffc1df8d7000
[ 206.143985] [7ff31fc040] *pgd=0000000262400003,
*pud=0000000262400003, *pmd=000000025ff4c003, *pte=00e0000255829f3
[ 206.143989] Internal error: Oops: 9200004f [#1] PREEMPT SMP
[ 206.143996] Modules linked in: bcmdhd pci_tegra bluedroid_pm ip_tables
[ 206.143999] CPU: 5 PID: 27356 Comm: spawn Tainted: G W
4.4.38-DATA-RT-g06219d69-dirty #7
[ 206.144000] Hardware name: quill (DT)
[ 206.144002] task: ffffffc1e45bd100 ti: ffffffc1e0320000 task.ti:
ffffffc1e0320000
[ 206.144005] PC is at 0x7f9fb6a198
[ 206.144006] LR is at 0x559517b9b0
[ 206.144008] pc : [<0000007f9fb6a198>] lr : [<000000559517b9b0>]
pstate: 20000000
[ 206.144009] sp : 0000007ff31fc060
[ 206.144013] x29: 0000007ff31fc0a0 x28: 0000000000000000
[ 206.144016] x27: 0000000000000000 x26: 0000000000000000
[ 206.144018] x25: 0000000000000000 x24: 0000000000000000
[ 206.144021] x23: 0000000000000000 x22: 000000000000001e
[ 206.144023] x21: 000000559518b000 x20: 0000007ff31fc094
[ 206.144026] x19: 000000559518c048 x18: 0000000000000003
[ 206.144028] x17: 0000007f9fb6a198 x16: 000000559518bf90
[ 206.144030] x15: 0000007f9fc4c150 x14: 0000000000000008
[ 206.144033] x13: 0000007f9fc2a34c x12: 0000007ff31fbfa0
[ 206.144035] x11: 0000007f9fc4f740 x10: 0000000000000000
[ 206.144038] x9 : 0000007ff31fc128 x8 : 00000000000000dc
[ 206.144040] x7 : 0000007f9fbee088 x6 : 0000007f9fc4fac8
[ 206.144042] x5 : 0000007f9fc40bb0 x4 : 0000007f9fc40c80
[ 206.144045] x3 : 0000000000000000 x2 : 8c391e6c47b6d000
[ 206.144047] x1 : 0000000000000000 x0 : 0000007ff31fc094
[ 206.144048]
[ 206.144050] Process spawn (pid: 27356, stack limit = 0xffffffc1e0320028)
The call path is:
do_fork -> copy_process -> threadgroup_change_end ->
percpu_up_read (calls preempt_disable) -> __percpu_up_read ->
__wake_up -> rt_spin_lock -> rt_spin_lock_slowlock ->
schedule (calls preempt_disable again) -> __schedule ->
schedule_debug -> in_atomic_preempt_off (returns true, preempt_count
== 2) -> __schedule_bug (which leads to the kernel page fault, OOPS!!)
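On PREEMPT_RT, spin_lock() is mapped to an rtmutex, so the contended
path is a sleeping lock. A heavily simplified pseudocode sketch of the
slowpath (the shape of rt_spin_lock_slowlock in the -rt patch; this is
NOT the exact 4.4-rt49 source, and PI-chain handling, state saving and
adaptive spinning are omitted):

```
/* Simplified pseudocode only -- illustrates why schedule() appears
 * in the trace, not the real implementation. */
static void rt_spin_lock_slowlock(struct rt_mutex *lock)
{
        raw_spin_lock(&lock->wait_lock);   /* a true raw spinlock */
        if (try_to_take_rt_mutex(lock, current, NULL)) {
                raw_spin_unlock(&lock->wait_lock);
                return;                    /* uncontended: done */
        }
        /* enqueue ourselves as a waiter, then block: */
        for (;;) {
                if (try_to_take_rt_mutex(lock, current, &waiter))
                        break;
                raw_spin_unlock(&lock->wait_lock);
                schedule();                /* sleep until woken */
                raw_spin_lock(&lock->wait_lock);
        }
        raw_spin_unlock(&lock->wait_lock);
}
```

So the slowpath assumes its caller is preemptible, which is exactly
what percpu_up_read's preempt_disable() has broken here.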
Before schedule() is reached, preempt_disable() has been called twice.
That bumps preempt_count to 2, so in_atomic_preempt_off() returns true
and __schedule_bug() is triggered.
What I have not figured out is: WHY do we call schedule() inside
rt_spin_lock_slowlock(), and under what conditions is that call correct?
Any ideas?