Re: Crash on boot with 6.10

On 8/30/24 20:18, Christoph Biedl wrote:
matoro wrote...

Hi all, just bumped to the newest mainline starting with 6.10.2 and
immediately ran into a crash on boot.  Fully reproducible, reverting back to
last known good (6.9.8) resolves the issue.  Any clue what's going on here?
I can provide full boot logs, start bisecting, etc if needed...

Is this supposed to have been fixed in the meantime? Using 6.10.7 from yesterday,
I am getting a similar crash:

(...)
[    9.653898] scsi 1:0:5:0: Power-on or device reset occurred
[   12.337213] sd 1:0:5:0: Attached scsi generic sg0 type 0
[   12.343544] sd 1:0:5:0: [sda] 17773524 512-byte logical blocks: (9.10 GB/8.47 GiB)
[   12.352957] sd 1:0:5:0: [sda] Write Protect is off
[   12.359151] sd 1:0:5:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   12.379035]  sda: sda1 sda2 sda3
[   12.383562] sd 1:0:5:0: [sda] Attached SCSI disk
[   12.397737] Freeing unused kernel image (initmem) memory: 3072K
[   12.406839] Backtrace:
[   12.409235]  [<1116535c>] kernel_init+0x80/0x1d4
[   12.413911]  [<1040201c>] ret_from_kernel_thread+0x1c/0x24
[   12.419448]
[   12.420970]
[   12.422487] Kernel Fault: Code=26 (Data memory access rights trap) at addr 113c5f90
[   12.430172] CPU: 0 PID: 1 Comm: swapper Not tainted 6.10.7 #1
[   12.435958] Hardware name: 9000/785/C3600
[   12.439997]
[   12.441518]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[   12.446256] PSW: 00000000000001000000000000001111 Not tainted
[   12.452033] r00-03  0004000f 113c9744 105994ac 128942c0
[   12.457295] r04-07  119bda70 1180d4e0 1180d4e0 11822a90
[   12.462555] r08-11  11822a70 112d1000 1180d7ec 00000017
[   12.467817] r12-15  00000000 11a1aa70 113b196c 112d1000
[   12.473077] r16-19  112d1000 ffffffff f0000174 113c723c
[   12.478338] r20-23  00000002 113c9744 113c5a70 000000d0
[   12.483597] r24-27  12892d00 00000000 119bde74 113c5a70
[   12.488859] r28-31  113c5f8c 01a19700 12894300 00000004
[   12.494158] sr00-03  00000000 00000000 00000000 00000000
[   12.499502] sr04-07  00000000 00000000 00000000 00000000
[   12.504850]
[   12.506373] IASQ: 00000000 00000000 IAOQ: 10599508 1059950c
[   12.511980]  IIR: 0f941288    ISR: 00000000  IOR: 113c5f90
[   12.517495]  CPU:        0   CR30: 12892d00 CR31: 11111111
[   12.523016]  ORIG_R28: 55555555
[   12.526185]  IAOQ[0]: jump_label_init_ro+0x98/0xe4
[   12.531014]  IAOQ[1]: jump_label_init_ro+0x9c/0xe4
[   12.535872]  RP(r2): jump_label_init_ro+0x3c/0xe4
[   12.540610] Backtrace:
[   12.543000]  [<1116535c>] kernel_init+0x80/0x1d4
[   12.547654]  [<1040201c>] ret_from_kernel_thread+0x1c/0x24
[   12.553319]
[   12.557345] Kernel panic - not syncing: Kernel Fault

The .config is attached; I can dig deeper over the next few days.

I can reproduce.

The crash happens because jump_label_init_ro() in kernel/jump_label.c
accesses the following static key, and that access faults because the key's
memory area is already read-only:
mm/usercopy.c:static DEFINE_STATIC_KEY_FALSE_RO(bypass_usercopy_checks);

This is the only static key in this parisc kernel that is marked __ro_after_init.
The area is made read-only in free_initmem() [in arch/parisc/mm/init.c],
which runs before mark_readonly() and therefore before jump_label_init_ro().
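
The failure mode can be reproduced in userspace as a rough analogy. The
sketch below is not kernel code: write_after_protect_faults() is a
hypothetical helper, an anonymous mmap() page stands in for the
__ro_after_init area, and mprotect() stands in for the protection that
free_initmem() applies on parisc. The write runs in a child process so that
the fault can be observed instead of killing the caller.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (userspace analogy, not kernel code).
 * Returns 1 if the write faulted, 0 if it succeeded, -1 on setup error.
 * protect_first != 0 makes the page read-only before the write, which
 * mimics "free_initmem() ran before jump_label_init_ro()".
 */
int write_after_protect_faults(int protect_first)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the page plays the role of the static key's storage. */
        volatile int *key = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (key == MAP_FAILED)
            _exit(2);
        if (protect_first &&
            mprotect((void *)key, (size_t)page, PROT_READ) != 0)
            _exit(2);
        key[0] = 1;          /* faults if the page is read-only */
        _exit(0);
    }

    /* Parent: classify how the child ended. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status) &&
        (WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS))
        return 1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return -1;
}
```

Calling write_after_protect_faults(1) corresponds to the ordering that
crashes here, and write_after_protect_faults(0) to a write that happens
while the area is still writable.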

So, the issue is basically triggered by this commit:

commit 91a1d97ef482c1e4c9d4c1c656a53b0f6b16d0ed
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date:   Wed Mar 13 19:01:03 2024 +0100
    jump_label,module: Don't alloc static_key_mod for __ro_after_init keys

due to this hunk:

diff --git a/init/main.c b/init/main.c
index 2ca52474d0c3..6c3f251d6ef8 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1408,6 +1408,7 @@ static void mark_readonly(void)
                 * insecure pages which are W+X.
                 */
                flush_module_init_free_work();
+               jump_label_init_ro();
                mark_rodata_ro();
                debug_checkwx();
                rodata_test();

I'm still unsure about the best way to fix it.
Swapping the calls to free_initmem() and mark_readonly() fixes it for me:

diff --git a/init/main.c b/init/main.c
index 206acdde51f5..1f82583fd21d 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1473,8 +1473,8 @@ static int __ref kernel_init(void *unused)
        ftrace_free_init_mem();
        kgdb_free_init_mem();
        exit_boot_config();
-       free_initmem();
        mark_readonly();
+       free_initmem();

        /*
         * Kernel mappings are now finalized - update the userspace page-table
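
Why the swap helps can be shown with a toy model of the two call orders.
This is only a sketch under assumptions: struct boot_model and the model_*
functions are hypothetical, the static key access is modelled as a write,
and the only behaviour taken from this thread is that parisc's
free_initmem() write-protects the __ro_after_init area and that
mark_readonly() calls jump_label_init_ro() before mark_rodata_ro().

```c
#include <stdbool.h>

/* Hypothetical toy state: is the __ro_after_init area still writable? */
struct boot_model {
    bool ro_after_init_writable;
    bool faulted;
};

/* Models parisc free_initmem(): as described above, it also
 * write-protects the __ro_after_init area. */
static void model_free_initmem(struct boot_model *m)
{
    m->ro_after_init_writable = false;
}

/* Models mark_readonly(): jump_label_init_ro() touches the key first
 * (modelled as a write), then mark_rodata_ro() protects the area. */
static void model_mark_readonly(struct boot_model *m)
{
    if (!m->ro_after_init_writable)
        m->faulted = true;              /* write to a read-only area */
    m->ro_after_init_writable = false;  /* assumed effect of mark_rodata_ro() */
}

/* Returns true if the chosen call order leads to the modelled fault. */
bool boot_order_faults(bool free_initmem_first)
{
    struct boot_model m = { .ro_after_init_writable = true, .faulted = false };

    if (free_initmem_first) {
        model_free_initmem(&m);   /* order in 6.10: crashes on parisc */
        model_mark_readonly(&m);
    } else {
        model_mark_readonly(&m);  /* swapped order from the diff above */
        model_free_initmem(&m);
    }
    return m.faulted;
}
```

In this model boot_order_faults(true) reports a fault and
boot_order_faults(false) does not. It says nothing about whether other
architectures depend on the current order.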


Opinions?

Helge




