Hi John,

On Thu, Nov 3, 2022 at 6:25 PM Sergio Paracuellos
<sergio.paracuellos@xxxxxxxxx> wrote:
>
> Hi John,
>
> Thanks for the patches!
>
> On Thu, Nov 3, 2022 at 12:15 PM John Thomson
> <lists@xxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, 3 Nov 2022, at 05:05, John Thomson wrote:
> > > Following commit 6edf2576a6cc ("mm/slub: enable debugging memory wasting
> > > of kmalloc") mt7621 failed to boot very early, without showing any
> > > console messages.
> > > This exposed the pre-existing bug of mt7621.c using kzalloc before normal
> > > memory management was available.
> > > Prior to this slub change, there existed the unintended protection against
> > > "kmem_cache *s" being NULL as slab_pre_alloc_hook() happened to
> > > return NULL and bailed out of slab_alloc_node().
> > > This allowed mt7621 prom_soc_init to fail in the soc_dev_init kzalloc,
> > > but continue booting without this soc device.
> > >
> > > Console output from a DEBUG_ZBOOT vmlinuz kernel loading,
> > > with mm/slub modified to warn on kmem_cache zero or null:
> > >
> > > zimage at: 80B842A0 810B4BC0
> > > Uncompressing Linux at load address 80001000
> > > Copy device tree to address 80B80EE0
> > > Now, booting the kernel...
> > >
> > > [ 0.000000] Linux version 6.1.0-rc3+ (john@john)
> > > (mipsel-buildroot-linux-gnu-gcc.br_real (Buildroot
> > > 2021.11-4428-g6b6741b) 12.2.0, GNU ld (GNU Binutils) 2.39) #73 SMP Wed
> > > Nov 2 05:10:01 AEST 2022
> > > [ 0.000000] ------------[ cut here ]------------
> > > [ 0.000000] WARNING: CPU: 0 PID: 0 at mm/slub.c:3416
> > > kmem_cache_alloc+0x5a4/0x5e8
> > > [ 0.000000] Modules linked in:
> > > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc3+ #73
> > > [ 0.000000] Stack : 810fff78 80084d98 00000000 00000004 00000000
> > > 00000000 80889d04 80c90000
> > > [ 0.000000] 80920000 807bd328 8089d368 80923bd3 00000000
> > > 00000001 80889cb0 00000000
> > > [ 0.000000] 00000000 00000000 807bd328 8084bcb1 00000002
> > > 00000002 00000001 6d6f4320
> > > [ 0.000000] 00000000 80c97d3d 80c97d68 fffffffc 807bd328
> > > 00000000 00000000 00000000
> > > [ 0.000000] 00000000 a0000000 80910000 8110a0b4 00000000
> > > 00000020 80010000 80010000
> > > [ 0.000000] ...
> > > [ 0.000000] Call Trace:
> > > [ 0.000000] [<80008260>] show_stack+0x28/0xf0
> > > [ 0.000000] [<8070c958>] dump_stack_lvl+0x60/0x80
> > > [ 0.000000] [<8002e184>] __warn+0xc4/0xf8
> > > [ 0.000000] [<8002e210>] warn_slowpath_fmt+0x58/0xa4
> > > [ 0.000000] [<801c0fac>] kmem_cache_alloc+0x5a4/0x5e8
> > > [ 0.000000] [<8092856c>] prom_soc_init+0x1fc/0x2b4
> > > [ 0.000000] [<80928060>] prom_init+0x44/0xf0
> > > [ 0.000000] [<80929214>] setup_arch+0x4c/0x6a8
> > > [ 0.000000] [<809257e0>] start_kernel+0x88/0x7c0
> > > [ 0.000000]
> > > [ 0.000000] ---[ end trace 0000000000000000 ]---
> > > [ 0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
> > > [ 0.000000] printk: bootconsole [early0] enabled
>
> Last version I tested on my gnubee PC1 mt7621 board was v6.0 and all
> was booting properly.

I have verified that with 6.1.0-rc1 the system does not boot, as you
pointed out here.
After adding your patches the system boots, but then hits an Oops in
soc_device_match_attr:

[ 20.569959] CPU 0 Unable to handle kernel paging request at virtual
address 675f6b6c, epc == 80403dec, ra == 804ae11c
[ 20.591060] Oops[#1]:
[ 20.595462] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc1+ #148
[ 20.608265] $ 0 : 00000000 00000001 82262a00 00000000
[ 20.618615] $ 4 : 675f6b6c 808dea04 00000000 804ae138
[ 20.628983] $ 8 : 00000000 808787ba 00000000 821f4b00
[ 20.639351] $12 : 0000005b 0000005d 0000002d 0000005c
[ 20.649735] $16 : 82253580 807b4034 807b4034 804ae138
[ 20.660087] $20 : fffffff4 82c382b8 809e1094 00000008
[ 20.670455] $24 : 0000002a 0000003f
[ 20.680823] $28 : 82050000 82051c30 80a0d638 804ae11c
[ 20.691190] Hi : 00000037
[ 20.696891] Lo : 5c28f6a0
[ 20.702610] epc : 80403dec glob_match+0x1c/0x240
[ 20.712100] ra : 804ae11c soc_device_match_attr+0xac/0xc8
[ 20.723330] Status: 11000403 KERNEL EXL IE
[ 20.731626] Cause : 40800008 (ExcCode 02)
[ 20.739576] BadVA : 675f6b6c
[ 20.745277] PrId : 0001992f (MIPS 1004Kc)
[ 20.753414] Modules linked in:
[ 20.759448] Process swapper/0 (pid: 1, threadinfo=(ptrval),
task=(ptrval), tls=00000000)
[ 20.775520] Stack : fffffff4 80496ab8 820c6010 828c8518 80950000
ffffffea 80950000 80496b48
[ 20.792106] 00000000 828c8400 820c6010 821f4880 1e160000
821bc754 82253734 7f8268e6
[ 20.808707] 809c6a94 807b4034 804ae138 809c8e88 819a0000
804ae1d8 80a0d638 80438e10
[ 20.825282] 821f3e70 80950000 808c0000 828c8400 820c6000
828c8548 820c6010 80456608
[ 20.841879] 821f3dc0 821d32c0 819a0000 801d8768 821f3dc0
821d32c0 828c8540 80950000
[ 20.858473] ...
[ 20.863298] Call Trace:
[ 20.868137] [<80403dec>] glob_match+0x1c/0x240
[ 20.876955] [<804ae11c>] soc_device_match_attr+0xac/0xc8
[ 20.887500] [<80496b48>] bus_for_each_dev+0x7c/0xc0
[ 20.897176] [<804ae1d8>] soc_device_match+0x98/0xc8
[ 20.906869] [<80456608>] mt7621_pcie_probe+0x90/0x7b8
[ 20.916876] [<8049b46c>] platform_probe+0x54/0x94
[ 20.926206] [<80499058>] really_probe+0x200/0x434
[ 20.935538] [<80499520>] driver_probe_device+0x44/0xd4
[ 20.945732] [<80499ae0>] __driver_attach+0xb8/0x1b0
[ 20.955428] [<80496b48>] bus_for_each_dev+0x7c/0xc0
[ 20.965089] [<80497f18>] bus_add_driver+0x100/0x218
[ 20.974763] [<8049a338>] driver_register+0xd0/0x118
[ 20.984438] [<80001590>] do_one_initcall+0x8c/0x28c
[ 20.994115] [<809e21c8>] kernel_init_freeable+0x254/0x28c
[ 21.004845] [<80781070>] kernel_init+0x24/0x118
[ 21.013830] [<800034f8>] ret_from_kernel_thread+0x14/0x1c
[ 21.024522]
[ 21.027457] Code: 240f005c 2418002a 2419003f <80820000> 24a90001
90a70000 104c006f 24860001 2843005c
[ 21.046810]
[ 21.049830] ---[ end trace 0000000000000000 ]---
[ 21.058935] Kernel panic - not syncing: Fatal exception
[ 21.069310] Rebooting in 1 seconds..
I have fixed this by adding a sentinel entry to the match table in each
of the following files:

drivers/pci/controller/pcie-mt7621.c
drivers/phy/ralink/phy-mt7621-pci.c

sergio@camaron:~/GNUBEE-SERGIO-TEST/linux$ git diff drivers/pci/controller/pcie-mt7621.c drivers/phy/ralink/phy-mt7621-pci.c
diff --git a/drivers/pci/controller/pcie-mt7621.c b/drivers/pci/controller/pcie-mt7621.c
index 4bd1abf26008..ee7aad09d627 100644
--- a/drivers/pci/controller/pcie-mt7621.c
+++ b/drivers/pci/controller/pcie-mt7621.c
@@ -466,7 +466,8 @@ static int mt7621_pcie_register_host(struct pci_host_bridge *host)
 }

 static const struct soc_device_attribute mt7621_pcie_quirks_match[] = {
-       { .soc_id = "mt7621", .revision = "E2" }
+       { .soc_id = "mt7621", .revision = "E2" },
+       { /* sentinel */ }
 };

 static int mt7621_pcie_probe(struct platform_device *pdev)
diff --git a/drivers/phy/ralink/phy-mt7621-pci.c b/drivers/phy/ralink/phy-mt7621-pci.c
index 5e6530f545b5..85888ab2d307 100644
--- a/drivers/phy/ralink/phy-mt7621-pci.c
+++ b/drivers/phy/ralink/phy-mt7621-pci.c
@@ -280,7 +280,8 @@ static struct phy *mt7621_pcie_phy_of_xlate(struct device *dev,
 }

 static const struct soc_device_attribute mt7621_pci_quirks_match[] = {
-       { .soc_id = "mt7621", .revision = "E2" }
+       { .soc_id = "mt7621", .revision = "E2" },
+       { /* sentinel */ }
 };

 static const struct regmap_config mt7621_pci_phy_regmap_config = {

With these two minor changes and your patches the system boots and
behaves properly. So FWIW feel free to add my:

Tested-by: Sergio Paracuellos <sergio.paracuellos@xxxxxxxxx>
Acked-by: Sergio Paracuellos <sergio.paracuellos@xxxxxxxxx>

Please let me know if you want me to send any patches, or if you are
going to create a complete patchset with all the needed changes.

Thank you very much for doing this!

Best regards,
    Sergio Paracuellos

[snip]