On Tue, Oct 29, 2024 at 5:24 PM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> On Tue, Oct 29, 2024 at 04:43:39PM +0100, Jan Stancek wrote:
> > On Tue, Oct 29, 2024 at 4:07 PM Zi Yan <ziy@xxxxxxxxxx> wrote:
> > >
> > > +tegra mailing list and maintainers
> > >
> > > On 29 Oct 2024, at 8:47, Jan Stancek wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm seeing a regression on Nvidia IGX system, which no longer boots.
> > > >
> > > > bisect points at commit 767507654c22 ("arch_numa: switch over to numa_memblks").
> > > > It hangs very early, with 4k or 64k pages, with no kernel messages printed:
> > > >
> > > > EFI stub: Booting Linux Kernel...
> > > > EFI stub: Using DTB from configuration table
> > > > EFI stub: Exiting boot services...
> > > > <hangs here>
> > > >
> > >
> > > Is it possible to have earlycon output? It is hard to debug without any
> > > information except kernel fails to boot.
> >
> > I know it was a long shot, so far I haven't had luck getting it to work.
>
> Does it boot with numa=off and numa=fake?

No, it doesn't.

>
> In the log from successful boot it seems there is no NUMA information in
> the device tree, can you send the device tree as well please?

https://people.redhat.com/jstancek/aarch64_numa_boot/device_tree

Regards,
Jan

>
> > > Since the previous commit boots and I assume both kernels are compiled
> > > with the same gcc toolchain, this should not be caused by the binutils
> > > bug in 2.42[1]. Is your binutils version 2.42?
> >
> > Yes, both are compiled locally, with binutils 2.41
> >
> > >
> > > Thanks.
> > >
> > >
> > > [1] https://sourceware.org/bugzilla/show_bug.cgi?id=31924
> > >
> > > > Here's a log from successful boot with previous commit:
> > > > https://people.redhat.com/jstancek/aarch64_numa_boot/console-log-good.txt
> > > > and config: https://people.redhat.com/jstancek/aarch64_numa_boot/config
> > > >
> > > > # lscpu
> > > > Architecture:           aarch64
> > > > CPU op-mode(s):         32-bit, 64-bit
> > > > Byte Order:             Little Endian
> > > > CPU(s):                 12
> > > > On-line CPU(s) list:    0-11
> > > > Vendor ID:              ARM
> > > > BIOS Vendor ID:         NVIDIA
> > > > Model name:             Cortex-A78AE
> > > > BIOS Model name:        Not Specified Not Specified CPU @ 0.0GHz
> > > > BIOS CPU family:        257
> > > > Model:                  1
> > > > Thread(s) per core:     1
> > > > Core(s) per cluster:    12
> > > > Socket(s):              1
> > > > Cluster(s):             1
> > > > Stepping:               r0p1
> > > > CPU(s) scaling MHz:     100%
> > > > CPU max MHz:            1971.2000
> > > > CPU min MHz:            115.2000
> > > > BogoMIPS:               62.50
> > > > Flags:                  fp asimd evtstrm aes pmull sha1 sha2 crc32
> > > > atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc
> > > > flagm paca pacg
> > > > Caches (sum of all):
> > > >   L1d:                  768 KiB (12 instances)
> > > >   L1i:                  768 KiB (12 instances)
> > > >   L2:                   3 MiB (12 instances)
> > > >   L3:                   6 MiB (3 instances)
> > > > NUMA:
> > > >   NUMA node(s):         1
> > > >   NUMA node0 CPU(s):    0-11
> > > > Vulnerabilities:
> > > >   Gather data sampling:   Not affected
> > > >   Itlb multihit:          Not affected
> > > >   L1tf:                   Not affected
> > > >   Mds:                    Not affected
> > > >   Meltdown:               Not affected
> > > >   Mmio stale data:        Not affected
> > > >   Reg file data sampling: Not affected
> > > >   Retbleed:               Not affected
> > > >   Spec rstack overflow:   Not affected
> > > >   Spec store bypass:      Mitigation; Speculative Store Bypass
> > > > disabled via prctl
> > > >   Spectre v1:             Mitigation; __user pointer sanitization
> > > >   Spectre v2:             Mitigation; CSV2, BHB
> > > >   Srbds:                  Not affected
> > > >   Tsx async abort:        Not affected
> > > >
> > > > Regards,
> > > > Jan
> > >
> > >
> > > Best Regards,
> > > Yan, Zi
> > >
> >
>
> --
> Sincerely yours,
> Mike.
>