Re: qemu-arm64: handle_futex_death - kernel/futex/core.c:661 - Unable to handle kernel unknown 43 at virtual address

On Thu, Oct 26, 2023 at 08:11:26PM +0530, Naresh Kamboju wrote:
> Following kernel crash noticed on qemu-arm64 while running LTP syscalls
> set_robust_list test case running Linux next 6.6.0-rc7-next-20231026 and
> 6.6.0-rc7-next-20231025.
> 
> BAD: next-20231025
> Good: next-20231024
> 
> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
> 
> Log:
> ----
> <1>[  203.119139] Unable to handle kernel unknown 43 at virtual
> address 0001ffff9e2e7d78
> <1>[  203.119838] Mem abort info:
> <1>[  203.120064]   ESR = 0x000000009793002b
> <1>[  203.121040]   EC = 0x25: DABT (current EL), IL = 32 bits
> set_robust_list01    1  TPASS  :  set_robust_list: retval = -1
> (expected -1), errno = 22 (expected 22)
> set_robust_list01    2  TPASS  :  set_robust_list: retval = 0
> (expected 0), errno = 0 (expected 0)
> <1>[  203.124496]   SET = 0, FnV = 0
> <1>[  203.124778]   EA = 0, S1PTW = 0
> <1>[  203.125029]   FSC = 0x2b: unknown 43

It looks like this is fallout from the LPA2 enablement.

According to the latest ARM ARM (ARM DDI 0487J.a, page D19-6475), that "unknown
43" (0x2b / 0b101011) is the DFSC encoding for a level -1 translation fault:

	0b101011 When FEAT_LPA2 is implemented:
		 Translation fault, level -1.
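
For anyone who wants to cross-check the "Mem abort info" lines in the log,
here is a minimal sketch that splits the reported ESR value into its fields.
The helper name decode_esr() and the small DFSC_NAMES table are made up for
illustration and only cover the translation fault codes relevant here; the bit
positions are the architectural ESR_ELx layout for a data abort.

```python
# Illustrative decoder, not kernel code. DFSC_NAMES is deliberately partial.
DFSC_NAMES = {
    0b000100: "translation fault, level 0",
    0b000101: "translation fault, level 1",
    0b000110: "translation fault, level 2",
    0b000111: "translation fault, level 3",
    0b101011: "translation fault, level -1 (FEAT_LPA2)",
}


def decode_esr(esr: int) -> dict:
    """Split a 32-bit ESR_ELx value into the fields printed in the oops."""
    iss = esr & 0x1FFFFFF  # ISS: bits [24:0]
    dfsc = iss & 0x3F      # data fault status code: bits [5:0]
    return {
        "ec": (esr >> 26) & 0x3F,  # exception class: bits [31:26]
        "il": (esr >> 25) & 0x1,   # instruction length: bit [25]
        "wnr": (iss >> 6) & 0x1,   # write-not-read: bit [6]
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, f"unknown {dfsc}"),
    }


fields = decode_esr(0x9793002B)  # ESR value taken from the log
print(fields)
```

The "unknown 43" text in the log is what you get when the code only falls
through to the default name, as in the DFSC_NAMES.get() fallback above.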

It's triggered here by an LDTR in a get_user() on a bogus userspace address.
The exception is expected and is supposed to be handled via the exception
fixups, but the LPA2 patches didn't update the fault_info table entries for
the level -1 faults. Those all get handled by do_bad(), which doesn't call
fixup_exception(), so they end up being fatal.
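
As a rough illustration of why this particular address faults at level -1,
the sketch below computes the per-level table index for the faulting address,
assuming the configuration the log reports (4k pages, 52-bit VAs), where each
full level resolves 9 bits and level -1 resolves the remaining bits [51:48].
table_index() is a hypothetical helper, not a kernel function.

```python
PAGE_SHIFT = 12      # 4k pages, as reported in the log
BITS_PER_LEVEL = 9   # 512 eight-byte descriptors per 4k table
VA_BITS = 52         # 52-bit VAs, as reported in the log


def table_index(va: int, level: int) -> int:
    """Index into the translation table at `level` (-1..3) for `va`."""
    shift = PAGE_SHIFT + BITS_PER_LEVEL * (3 - level)
    # The top level only covers whatever VA bits are left above `shift`.
    width = min(BITS_PER_LEVEL, VA_BITS - shift)
    return (va >> shift) & ((1 << width) - 1)


fault_va = 0x0001FFFF9E2E7D78  # faulting address from the log
print(table_index(fault_va, -1))  # non-zero: above the 48-bit range
print(table_index(fault_va, 0))
```

The faulting address has bit 48 set, so its level -1 index is non-zero. The
log shows the corresponding entry as empty (pgd=0000000000000000), which is
consistent with the walk stopping at level -1.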

It should be relatively simple to update the fault_info table for the level -1
faults, but given the other issues we're seeing I think it's probably worth
dropping the LPA2 patches for the moment.
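
To make the failure mode concrete, here is a toy model of the dispatch
described above. It is not the kernel code: the function names mirror the ones
mentioned in this mail, but the bodies, the string return values, the
build_fault_info() helper, and the fixup set keyed by PC are simplifications
invented for illustration. In particular, the real path from a translation
fault to the exception fixups has more steps than shown here.

```python
def fixup_exception(pc: int, fixups: set) -> bool:
    """Model: a fault is recoverable if the faulting PC has a fixup entry."""
    return pc in fixups


def do_translation_fault(pc: int, fixups: set) -> str:
    # Model: translation faults on uaccess instructions reach the fixups.
    return "fixed up" if fixup_exception(pc, fixups) else "fatal"


def do_bad(pc: int, fixups: set) -> str:
    # Model: the catch-all handler never consults the fixups.
    return "fatal"


def build_fault_info(level_minus1_handler):
    """Model of a table indexed by the 6-bit fault status code."""
    table = {code: do_bad for code in range(64)}
    for code in (0b000100, 0b000101, 0b000110, 0b000111):
        table[code] = do_translation_fault  # levels 0..3
    table[0b101011] = level_minus1_handler  # level -1 (FEAT_LPA2)
    return table


ldtr_pc = 0x1000     # hypothetical PC of the LDTR in get_user()
fixups = {ldtr_pc}   # get_user() registers a fixup for that instruction

before = build_fault_info(do_bad)[0b101011](ldtr_pc, fixups)
after = build_fault_info(do_translation_fault)[0b101011](ldtr_pc, fixups)
print(before, "->", after)
```

With the level -1 entry left pointing at the catch-all handler, the fixup
registered for the LDTR is never looked at, which matches the oops in the
report. Pointing that entry at the same handler as the other translation
fault levels lets the fixup take effect in this model.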

Mark.

> <1>[  203.126470] Data abort info:
> <1>[  203.126710]   Access size = 4 byte(s)
> <1>[  203.126969]   SSE = 0, SRT = 19
> <1>[  203.127708]   SF = 0, AR = 0
> <1>[  203.128213]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> <1>[  203.128788]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> <1>[  203.130416] user pgtable: 4k pages, 52-bit VAs, pgdp=000000010606a780
> <1>[  203.130817] [0001ffff9e2e7d78] pgd=0000000000000000
> <0>[  203.132603] Internal error: Oops: 000000009793002b [#1] PREEMPT SMP
> <4>[  203.133483] Modules linked in: btrfs blake2b_generic libcrc32c
> xor xor_neon raid6_pq zstd_compress crct10dif_ce sm3_ce sm3 sha3_ce
> sha512_ce sha512_arm64 fuse drm backlight dm_mod ip_tables x_tables
> <4>[  203.135177] CPU: 1 PID: 653 Comm: set_robust_list Not tainted
> 6.6.0-rc7-next-20231026 #1
> <4>[  203.135642] Hardware name: linux,dummy-virt (DT)
> <4>[  203.136609] pstate: 83400009 (Nzcv daif +PAN -UAO +TCO +DIT
> -SSBS BTYPE=--)
> <4>[ 203.137028] pc : handle_futex_death (kernel/futex/core.c:661
> (discriminator 6))
> <4>[ 203.138844] lr : handle_futex_death
> (arch/arm64/include/asm/uaccess.h:46 (discriminator 1)
> kernel/futex/core.c:661 (discriminator 1))
> <4>[  203.139132] sp : ffff8000805c3c10
> <4>[  203.139356] x29: ffff8000805c3c10 x28: 0000ffffbf187740 x27:
> d53bd04035000220
> <4>[  203.140366] x26: 0000000000000000 x25: fff00000c6195280 x24:
> fff00000c6195280
> <4>[  203.141055] x23: 0000000000000001 x22: ffffa4e6aeef09d0 x21:
> 0001ffff9e2e7d78
> <4>[  203.141771] x20: 0001ffff9e2e7d78 x19: 0001ffff9e2e7d78 x18:
> ffff8000805c3cf8
> <4>[  203.142457] x17: 0000000000000000 x16: ffffa4e6aeae7078 x15:
> 000000000000000a
> <4>[  203.143134] x14: 0000000000000000 x13: 1ffe000018258661 x12:
> ffff8000805c3cf8
> <4>[  203.143809] x11: 0000000000000000 x10: fff00000c12c3308 x9 :
> ffffa4e6ad0e5748
> <4>[  203.144504] x8 : ffff8000805c3c38 x7 : 0000000000000000 x6 :
> 0000000000000001
> <4>[  203.145186] x5 : 0000000000000000 x4 : fff00000c6195280 x3 :
> 0000000000000000
> <4>[  203.145929] x2 : 0000000000000000 x1 : 000ffffffffffffc x0 :
> 0001ffff9e2e7d78
> <4>[  203.147032] Call trace:
> <4>[ 203.147254] handle_futex_death (kernel/futex/core.c:661 (discriminator 6))
> <4>[ 203.147560] exit_robust_list (kernel/futex/core.c:828)
> <4>[ 203.148348] futex_exit_release (kernel/futex/core.c:1035
> (discriminator 1) kernel/futex/core.c:1131 (discriminator 1))
> <4>[ 203.148891] exit_mm_release (kernel/fork.c:1657)
> <4>[ 203.149669] do_exit (kernel/exit.c:541 kernel/exit.c:858)
> <4>[ 203.149897] do_group_exit (kernel/exit.c:1002)
> <4>[ 203.150209] __arm64_sys_exit_group (kernel/exit.c:1032)
> <4>[ 203.150980] invoke_syscall (arch/arm64/include/asm/current.h:19
> arch/arm64/kernel/syscall.c:56)
> <4>[ 203.151234] el0_svc_common.constprop.0
> (include/linux/thread_info.h:127 (discriminator 2)
> arch/arm64/kernel/syscall.c:144 (discriminator 2))
> <4>[ 203.151999] do_el0_svc (arch/arm64/kernel/syscall.c:156)
> <4>[ 203.152231] el0_svc (arch/arm64/include/asm/daifflags.h:28
> arch/arm64/kernel/entry-common.c:133
> arch/arm64/kernel/entry-common.c:144
> arch/arm64/kernel/entry-common.c:679)
> <4>[ 203.152936] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:697)
> <4>[ 203.153518] el0t_64_sync (arch/arm64/kernel/entry.S:595)
> <0>[ 203.154424] Code: d50323bf d65f03c0 9248fa93 52800002 (b8400a73)
> All code
> ========
>    0: d50323bf autiasp
>    4: d65f03c0 ret
>    8: 9248fa93 and x19, x20, #0xff7fffffffffffff
>    c: 52800002 mov w2, #0x0                    // #0
>   10:* b8400a73 ldtr w19, [x19] <-- trapping instruction
> 
> Code starting with the faulting instruction
> ===========================================
>    0: b8400a73 ldtr w19, [x19]
> <4>[  203.155308] ---[ end trace 0000000000000000 ]---
> <1>[  203.156234] Fixing recursive fault but reboot is needed!
> <3>[  203.157116] BUG: using smp_processor_id() in preemptible
> [00000000] code: set_robust_list/653
> <4>[ 203.158116] caller is debug_smp_processor_id (lib/smp_processor_id.c:61)
> <4>[  203.158983] CPU: 1 PID: 653 Comm: set_robust_list Tainted: G
>  D            6.6.0-rc7-next-20231026 #1
> <4>[  203.159451] Hardware name: linux,dummy-virt (DT)
> <4>[  203.159990] Call trace:
> <4>[ 203.160394] dump_backtrace (arch/arm64/kernel/stacktrace.c:235)
> <4>[ 203.160625] show_stack (arch/arm64/kernel/stacktrace.c:242)
> <4>[ 203.160854] dump_stack_lvl (lib/dump_stack.c:107)
> <4>[ 203.161869] dump_stack (lib/dump_stack.c:114)
> <4>[ 203.162093] check_preemption_disabled
> (arch/arm64/include/asm/current.h:19
> arch/arm64/include/asm/preempt.h:54 lib/smp_processor_id.c:53)
> <4>[ 203.162898] debug_smp_processor_id (lib/smp_processor_id.c:61)
> <4>[ 203.163176] __schedule (kernel/sched/core.c:6578 (discriminator 1))
> <4>[ 203.163894] do_task_dead (kernel/sched/core.c:6705)
> <4>[ 203.164143] make_task_dead
> (arch/arm64/include/asm/atomic_ll_sc.h:95 (discriminator 3)
> arch/arm64/include/asm/atomic.h:49 (discriminator 3)
> include/linux/atomic/atomic-arch-fallback.h:747 (discriminator 3)
> include/linux/atomic/atomic-instrumented.h:253 (discriminator 3)
> include/linux/refcount.h:193 (discriminator 3)
> include/linux/refcount.h:250 (discriminator 3)
> include/linux/refcount.h:267 (discriminator 3) kernel/exit.c:979
> (discriminator 3))
> <4>[ 203.164871] die (arch/arm64/kernel/traps.c:239)
> <4>[ 203.165093] die_kernel_fault (arch/arm64/mm/fault.c:321)
> <4>[ 203.165905] do_mem_abort (arch/arm64/mm/fault.c:850)
> <4>[ 203.166149] el1_abort (arch/arm64/include/asm/daifflags.h:28
> arch/arm64/kernel/entry-common.c:399)
> <4>[ 203.166864] el1h_64_sync_handler (arch/arm64/kernel/entry-common.c:486)
> <4>[ 203.167173] el1h_64_sync (arch/arm64/kernel/entry.S:590)
> <4>[ 203.167824] handle_futex_death (kernel/futex/core.c:661 (discriminator 6))
> <4>[ 203.168329] exit_robust_list (kernel/futex/core.c:828)
> <4>[ 203.168829] futex_exit_release (kernel/futex/core.c:1035
> (discriminator 1) kernel/futex/core.c:1131 (discriminator 1))
> <4>[ 203.169375] exit_mm_release (kernel/fork.c:1657)
> <4>[ 203.169884] do_exit (kernel/exit.c:541 kernel/exit.c:858)
> <4>[ 203.170372] do_group_exit (kernel/exit.c:1002)
> <4>[ 203.170857] __arm64_sys_exit_group (kernel/exit.c:1032)
> <4>[ 203.171643] invoke_syscall (arch/arm64/include/asm/current.h:19
> arch/arm64/kernel/syscall.c:56)
> <4>[ 203.172281] el0_svc_common.constprop.0
> (include/linux/thread_info.h:127 (discriminator 2)
> arch/arm64/kernel/syscall.c:144 (discriminator 2))
> <4>[ 203.172815] do_el0_svc (arch/arm64/kernel/syscall.c:156)
> <4>[ 203.173284] el0_svc (arch/arm64/include/asm/daifflags.h:28
> arch/arm64/kernel/entry-common.c:133
> arch/arm64/kernel/entry-common.c:144
> arch/arm64/kernel/entry-common.c:679)
> <4>[ 203.173769] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:697)
> <4>[ 203.174052] el0t_64_sync (arch/arm64/kernel/entry.S:595)
> 
> 
> 
> Links:
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20231026/testrun/20823098/suite/log-parser-test/test/check-kernel-bug/log
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20231026/testrun/20823098/suite/log-parser-test/tests/
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20231026/testrun/20823050/suite/log-parser-test/tests/
> 
> --
> Linaro LKFT
> https://lkft.linaro.org


