On Mon, 30 Oct 2023 at 09:07, Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> wrote:
>
> On Sat, 28 Oct 2023 at 13:12, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > On Fri, 27 Oct 2023 at 12:57, Naresh Kamboju <naresh.kamboju@xxxxxxxxxx> wrote:
> > >
> > > On Thu, 26 Oct 2023 at 21:09, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> > > >
> > > > On Thu, 26 Oct 2023 at 17:30, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > > > >
> > > > > On Thu, Oct 26, 2023 at 08:11:26PM +0530, Naresh Kamboju wrote:
> > > > > > Following kernel crash noticed on qemu-arm64 while running LTP syscalls
> > > > > > set_robust_list test case running Linux next 6.6.0-rc7-next-20231026 ...
> > > > > It looks like this is fallout from the LPA2 enablement.
> > > > >
> > > > > According to the latest ARM ARM (ARM DDI 0487J.a), page D19-6475, that "unknown
> > > > > 43" (0x2b / 0b101011) is the DFSC for a level -1 translation fault:
> > > > >
> > > > >     0b101011    When FEAT_LPA2 is implemented:
> > > > >                 Translation fault, level -1.
> > > > >
> > > > > It's triggered here by an LDTR in a get_user() on a bogus userspace address.
> > > > > The exception is expected, and it's supposed to be handled via the exception
> > > > > fixups, but the LPA2 patches didn't update the fault_info table entries for all
> > > > > the level -1 faults, and so those all get handled by do_bad() and don't call
> > > > > fixup_exception(), causing them to be fatal.
> > > > >
> > > > > It should be relatively simple to update the fault_info table for the level -1
> > > > > faults, but given the other issues we're seeing I think it's probably worth
> > > > > dropping the LPA2 patches for the moment.
> > > > >
> > > >
> > > > Thanks for the analysis Mark.
> > > >
> > > > I agree that this should not be difficult to fix, but given the other
> > > > CI problems and identified loose ends, I am not going to object to
> > > > dropping this partially or entirely at this point. I'm sure everybody
> > > > will be thrilled to go over those 60 patches again after I rebase them
> > > > onto v6.7-rc1 :-)
> > >
> > > I am happy to test any proposed fix patch.
> > >
> >
> > Thanks Naresh. Patch attached.
>
> This patch did not solve the reported problem.
> Test log links,
>  - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/naresh/tests/2XTP1lXcUUscT357YaAm2G1AhpS
>

Oops, sorry about that. Fixed patch attached.
From 97dea432bceadfcece84484609374c277afc2c81 Mon Sep 17 00:00:00 2001
From: Ard Biesheuvel <ardb@xxxxxxxxxx>
Date: Sat, 28 Oct 2023 09:40:29 +0200
Subject: [PATCH v2] Add missing ESR decoding for level -1 translation faults

Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
---
 arch/arm64/mm/fault.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 2e5d1e238af9..13f192691060 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -780,18 +780,18 @@ static const struct fault_info fault_info[] = {
 	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 1 translation fault" },
 	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 2 translation fault" },
 	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 3 translation fault" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 8" },
+	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 0 access flag fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 1 access flag fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 access flag fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 access flag fault" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 12" },
+	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 0 permission fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 1 permission fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 permission fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 permission fault" },
 	{ do_sea, SIGBUS, BUS_OBJERR, "synchronous external abort" },
 	{ do_tag_check_fault, SIGSEGV, SEGV_MTESERR, "synchronous tag check fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 18" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 19" },
+	{ do_sea, SIGKILL, SI_KERNEL, "level -1 (translation table walk)" },
 	{ do_sea, SIGKILL, SI_KERNEL, "level 0 (translation table walk)" },
 	{ do_sea, SIGKILL, SI_KERNEL, "level 1 (translation table walk)" },
 	{ do_sea, SIGKILL, SI_KERNEL, "level 2 (translation table walk)" },
@@ -799,7 +799,7 @@ static const struct fault_info fault_info[] = {
 	{ do_sea, SIGBUS, BUS_OBJERR, "synchronous parity or ECC error" }, // Reserved when RAS is implemented
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 25" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 26" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 27" },
+	{ do_sea, SIGKILL, SI_KERNEL, "level -1 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
 	{ do_sea, SIGKILL, SI_KERNEL, "level 0 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
 	{ do_sea, SIGKILL, SI_KERNEL, "level 1 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
 	{ do_sea, SIGKILL, SI_KERNEL, "level 2 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
@@ -813,9 +813,9 @@ static const struct fault_info fault_info[] = {
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 38" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 39" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 40" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 41" },
+	{ do_bad, SIGKILL, SI_KERNEL, "level -1 address size fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 42" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 43" },
+	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level -1 translation fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 44" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 45" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 46" },
-- 
2.42.0.820.g83a721a137-goog
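
For anyone following along, the lookup Mark describes can be modelled in user space roughly as below. This is only a sketch under assumptions, not the kernel code: `DFSC_MASK`, `struct fault_entry`, `enum handler`, `table_before`, `table_after`, `lookup()` and `reaches_fixup()` are hypothetical names made up for illustration, and the tables carry just the one entry of interest. It shows that a fault status code of 0x2b selects entry 43, and that only a non-do_bad handler gets as far as the exception fixups.

```c
#include <stdint.h>

/* Hypothetical model: the low six bits of the ESR select the table entry. */
#define DFSC_MASK 0x3fu

/* H_BAD must be 0 so that unlisted table slots default to it. */
enum handler { H_BAD = 0, H_TRANSLATION_FAULT };

struct fault_entry {
	enum handler fn;
	const char *name;
};

/* Before the patch: entry 43 is routed to the "bad" handler. */
static const struct fault_entry table_before[64] = {
	[43] = { H_BAD, "unknown 43" },
};

/* After the patch: entry 43 is treated as a translation fault. */
static const struct fault_entry table_after[64] = {
	[43] = { H_TRANSLATION_FAULT, "level -1 translation fault" },
};

/* Index the 64-entry table with the fault status code taken from the ESR. */
static const struct fault_entry *lookup(const struct fault_entry *table,
					uint64_t esr)
{
	return &table[esr & DFSC_MASK];
}

/*
 * Models the behaviour described in the thread: the "bad" handler never
 * consults the exception fixups, so an expected get_user() fault is fatal.
 */
static int reaches_fixup(const struct fault_entry *e)
{
	return e->fn != H_BAD;
}
```

With an illustrative ESR value of 0x9600002b (not taken from the report), `lookup(table_before, esr)` lands on "unknown 43" and `reaches_fixup()` returns 0, whereas `lookup(table_after, esr)` lands on "level -1 translation fault" and `reaches_fixup()` returns 1.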