On 09/12/2022 11:31, Pierre Gondois wrote:
> v2:
>  - Applied renaming/formatting comments from v1.
>  - Check CACHE_TYPE_VALID flag in pptt.c.
> v3:
>  - Applied Sudeep's suggestions (for patch 5/5):
>    - Renaming allocate_cache_info() -> fetch_cache_info()
>    - Updated error message
>    - Extract an inline allocate_cache_info() function
>  - Re-run checkpatch with --strict option
>
> Note:
> This patchset requires the following patch to be applied first in
> order to avoid the same bug described in the commit message:
> https://lore.kernel.org/all/20221116094958.2141072-1-pierre.gondois@xxxxxxx/
>
> [1] and [2] build the CPU topology from the cacheinfo information for
> both DT/ACPI based systems and remove (struct cpu_topology).llc_id
> which was used by ACPI only.
>
> Creating the cacheinfo for secondary CPUs is done during early boot.
> Preemption and interrupts are disabled at this stage. On PREEMPT_RT
> kernels, allocating memory (and parsing the PPTT table for ACPI based
> systems) triggers a:
>   'BUG: sleeping function called from invalid context' [4]
>
> To prevent this bug, allocate the cacheinfo from the primary CPU when
> preemption and interrupts are enabled and before booting secondary
> CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
> only, without relying on the arm64 CLIDR_EL1 register.
> If no cache information is found in the DT/ACPI PPTT, then fall back
> to the current state, triggering [4] on PREEMPT_RT kernels.
>
> Patches to update the arm64 device trees that have incomplete cacheinfo
> (mostly for missing the 'cache-level' or 'cache-unified' property)
> have been sent at [3].
>
> Tested platforms:
> - ACPI + PPTT: Ampere Altra, Ampere eMAG, Cavium ThunderX2,
>   Kunpeng 920, Juno-r2
> - DT: rb5, db845c, Juno-r2
>

I gave the patchset a try with DTS fixes for cache topology on the
Qualcomm RB5 board (SM8250 SoC). With KASAN enabled it reports:

BUG: KASAN: slab-out-of-bounds in populate_cache_leaves+0x84/0x15c
[ 0.633014] dump_backtrace.part.0+0xe0/0xf0
[ 0.633035] show_stack+0x18/0x40
[ 0.633050] dump_stack_lvl+0x8c/0xb8
[ 0.633085] print_report+0x188/0x488
[ 0.633106] kasan_report+0xac/0xf0
[ 0.633136] __asan_store4+0x80/0xa4
[ 0.633158] populate_cache_leaves+0x84/0x15c
[ 0.633181] detect_cache_attributes+0xc0/0x8c4
[ 0.633213] update_siblings_masks+0x28/0x43c
[ 0.633235] store_cpu_topology+0x98/0xc0
[ 0.633251] smp_prepare_cpus+0x2c/0x15c
[ 0.633281] kernel_init_freeable+0x22c/0x424
[ 0.633310] kernel_init+0x24/0x13c
[ 0.633328] ret_from_fork+0x10/0x20
[ 0.633388]
[ 0.708729] Allocated by task 1:
[ 0.712078] kasan_save_stack+0x2c/0x60
[ 0.716066] kasan_set_track+0x2c/0x40
[ 0.719959] kasan_save_alloc_info+0x24/0x3c
[ 0.724387] __kasan_kmalloc+0xa0/0xbc
[ 0.728278] __kmalloc+0x74/0x110
[ 0.731740] fetch_cache_info+0x170/0x210
[ 0.735902] init_cpu_topology+0x254/0x2bc
[ 0.740171] smp_prepare_cpus+0x20/0x15c
[ 0.744272] kernel_init_freeable+0x22c/0x424
[ 0.748791] kernel_init+0x24/0x13c
[ 0.752420] ret_from_fork+0x10/0x20

Best regards,
Krzysztof
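
A note on reading the report: the numbers in the full KASAN report in
the attached log are enough to work out how far past the allocation the
write landed. The sketch below is only an illustration and not part of
the patchset. It assumes generic KASAN (suggested by the __asan_store4
frame), where one shadow byte describes 8 bytes of memory, 00 means
fully accessible, and a value from 01 to 07 means only that many leading
bytes are accessible. The constant names and the helper
accessible_bytes() are made up for this example.

```python
# Assumption: generic KASAN, one shadow byte per 8 bytes of memory.
GRANULE = 8

# Values copied from the report in the attached log.
OBJECT_START = 0xFFFF0D8340036600  # "belongs to the object at ..."
ACCESS_ADDR = 0xFFFF0D83400366C8   # "Write of size 4 at addr ..."
ACCESS_SIZE = 4

# Shadow rows covering the 256-byte slab slot ("Memory state" section).
SHADOW_ROWS = {
    0xFFFF0D8340036600: "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00",
    0xFFFF0D8340036680: "00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc",
}


def accessible_bytes(rows, start):
    """Count accessible bytes from `start` until the first poisoned granule."""
    total = 0
    for row_addr in sorted(rows):
        if row_addr < start:
            continue
        for text in rows[row_addr].split():
            value = int(text, 16)
            if value == 0:
                total += GRANULE       # whole granule is accessible
            elif value < GRANULE:
                return total + value   # partially accessible granule
            else:
                return total           # poisoned (e.g. fc), stop here
    return total


offset = ACCESS_ADDR - OBJECT_START
usable = accessible_bytes(SHADOW_ROWS, OBJECT_START)
overrun = offset + ACCESS_SIZE - usable

print(f"write at offset {offset}, {usable} bytes accessible, "
      f"write ends {overrun} bytes past the accessible area")
```

Under those assumptions the offset comes out as 200, matching "located
200 bytes inside of" in the report, and the accessible area is 192
bytes, so the 4-byte store starts 8 bytes past the end of the buffer
allocated in fetch_cache_info().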
[ 0.000000] arch_timer: cp15 and mmio timer(s) running at 19.20MHz (virt/virt).
[ 0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x46d987e47, max_idle_ns: 440795202767 ns
[ 0.000001] sched_clock: 56 bits at 19MHz, resolution 52ns, wraps every 4398046511078ns
[ 0.005051] Console: colour dummy device 80x25
[ 0.478980] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[ 0.478992] ... MAX_LOCKDEP_SUBCLASSES: 8
[ 0.479002] ... MAX_LOCK_DEPTH: 48
[ 0.479011] ... MAX_LOCKDEP_KEYS: 8192
[ 0.479019] ... CLASSHASH_SIZE: 4096
[ 0.479027] ... MAX_LOCKDEP_ENTRIES: 32768
[ 0.479035] ... MAX_LOCKDEP_CHAINS: 65536
[ 0.479043] ... CHAINHASH_SIZE: 32768
[ 0.479052] memory used by lock dependency info: 6365 kB
[ 0.479061] memory used for stack traces: 4224 kB
[ 0.479069] per task-struct memory footprint: 1920 bytes
[ 0.479976] Calibrating delay loop (skipped), value calculated using timer frequency.. 38.40 BogoMIPS (lpj=19200)
[ 0.480007] pid_max: default: 32768 minimum: 301
[ 0.482256] LSM: Security Framework initializing
[ 0.484629] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[ 0.484692] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
[ 0.518295] ==================================================================
[ 0.617001] BUG: KASAN: slab-out-of-bounds in populate_cache_leaves+0x84/0x15c
[ 0.624489] Write of size 4 at addr ffff0d83400366c8 by task swapper/0/1
[ 0.631400]
[ 0.632973] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rt5-00372-ga6339d0b4e8e #45
[ 0.632995] Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
[ 0.633006] Call trace:
[ 0.633014] dump_backtrace.part.0+0xe0/0xf0
[ 0.633035] show_stack+0x18/0x40
[ 0.633050] dump_stack_lvl+0x8c/0xb8
[ 0.633085] print_report+0x188/0x488
[ 0.633106] kasan_report+0xac/0xf0
[ 0.633136] __asan_store4+0x80/0xa4
[ 0.633158] populate_cache_leaves+0x84/0x15c
[ 0.633181] detect_cache_attributes+0xc0/0x8c4
[ 0.633213] update_siblings_masks+0x28/0x43c
[ 0.633235] store_cpu_topology+0x98/0xc0
[ 0.633251] smp_prepare_cpus+0x2c/0x15c
[ 0.633281] kernel_init_freeable+0x22c/0x424
[ 0.633310] kernel_init+0x24/0x13c
[ 0.633328] ret_from_fork+0x10/0x20
[ 0.633388]
[ 0.708729] Allocated by task 1:
[ 0.712078] kasan_save_stack+0x2c/0x60
[ 0.716066] kasan_set_track+0x2c/0x40
[ 0.719959] kasan_save_alloc_info+0x24/0x3c
[ 0.724387] __kasan_kmalloc+0xa0/0xbc
[ 0.728278] __kmalloc+0x74/0x110
[ 0.731740] fetch_cache_info+0x170/0x210
[ 0.735902] init_cpu_topology+0x254/0x2bc
[ 0.740171] smp_prepare_cpus+0x20/0x15c
[ 0.744272] kernel_init_freeable+0x22c/0x424
[ 0.748791] kernel_init+0x24/0x13c
[ 0.752420] ret_from_fork+0x10/0x20
[ 0.756131]
[ 0.757726] The buggy address belongs to the object at ffff0d8340036600
[ 0.757726] which belongs to the cache kmalloc-256 of size 256
[ 0.770607] The buggy address is located 200 bytes inside of
[ 0.770607] 256-byte region [ffff0d8340036600, ffff0d8340036700)
[ 0.782690]
[ 0.784256] The buggy address belongs to the physical page:
[ 0.790008] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100034
[ 0.799686] head:(____ptrval____) order:2 compound_mapcount:0 compound_pincount:0
[ 0.807405] flags: 0x800000000010200(slab|head|node=0|zone=2)
[ 0.813365] raw: 0800000000010200 0000000000000000 dead000000000122 ffff0d8340002480
[ 0.821349] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[ 0.829364] page dumped because: kasan: bad access detected
[ 0.835117]
[ 0.836679] Memory state around the buggy address:
[ 0.841639]  ffff0d8340036580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 0.849085]  ffff0d8340036600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 0.856562] >ffff0d8340036680: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
[ 0.864005]                                               ^
[ 0.869760]  ffff0d8340036700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 0.877206]  ffff0d8340036780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 0.884650] ==================================================================
[ 0.892100] Disabling lock debugging due to kernel taint
[ 0.904200] cblist_init_generic: Setting adjustable number of callback queues.
[ 0.904241] cblist_init_generic: Setting shift to 3 and lim to 1.
[ 0.918718] cblist_init_generic: Setting shift to 3 and lim to 1.
[ 0.925721] Running RCU-tasks wait API self tests
[ 1.036108] rcu: Hierarchical SRCU implementation.
[ 1.036118] rcu: Max phase no-delay instances is 400.
[ 1.036756] printk: bootconsole [qcom_geni0] printing thread started
[ 1.048808] Callback from call_rcu_tasks_trace() invoked.
[ 1.081711] EFI services will not be available.
[ 1.091529] smp: Bringing up secondary CPUs ...
[ 1.103585] Detected VIPT I-cache on CPU1
[ 1.103747] GICv3: CPU1: found redistributor 100 region 0:0x0000000017a80000
[ 1.103830] CPU1: Booted secondary processor 0x0000000100 [0x51df805e]
[ 1.129767] Detected VIPT I-cache on CPU2
[ 1.129911] GICv3: CPU2: found redistributor 200 region 0:0x0000000017aa0000
[ 1.129979] CPU2: Booted secondary processor 0x0000000200 [0x51df805e]
[ 1.155742] Detected VIPT I-cache on CPU3
[ 1.155876] GICv3: CPU3: found redistributor 300 region 0:0x0000000017ac0000
[ 1.155936] CPU3: Booted secondary processor 0x0000000300 [0x51df805e]
[ 1.182791] CPU features: detected: Spectre-v4
[ 1.182824] CPU features: detected: Spectre-BHB
[ 1.182854] CPU features: detected: ARM erratum 1508412 (kernel portion)
[ 1.182900] Detected PIPT I-cache on CPU4
[ 1.183187] GICv3: CPU4: found redistributor 400 region 0:0x0000000017ae0000
[ 1.183294] CPU4: Booted secondary processor 0x0000000400 [0x411fd0d0]
[ 1.225640] Detected PIPT I-cache on CPU5
[ 1.226000] GICv3: CPU5: found redistributor 500 region 0:0x0000000017b00000
[ 1.226099] CPU5: Booted secondary processor 0x0000000500 [0x411fd0d0]
[ 1.252358] Detected PIPT I-cache on CPU6
[ 1.252722] GICv3: CPU6: found redistributor 600 region 0:0x0000000017b20000
[ 1.252821] CPU6: Booted secondary processor 0x0000000600 [0x411fd0d0]
[ 1.266024] Callback from call_rcu_tasks() invoked.
[ 1.284303] Detected PIPT I-cache on CPU7
[ 1.284475] GICv3: CPU7: found redistributor 700 region 0:0x0000000017b40000
[ 1.284525] CPU7: Booted secondary processor 0x0000000700 [0x411fd0d0]
[ 1.284926] smp: Brought up 1 node, 8 CPUs
[ 1.284943] SMP: Total of 8 processors activated.
[ 1.284954] CPU features: detected: 32-bit EL0 Support
[ 1.284963] CPU features: detected: Data cache clean to the PoU not required for I/D coherence
[ 1.284975] CPU features: detected: Common not Private translations
[ 1.284985] CPU features: detected: CRC32 instructions
[ 1.284999] CPU features: detected: RCpc load-acquire (LDAPR)
[ 1.285009] CPU features: detected: LSE atomic instructions
[ 1.285019] CPU features: detected: Privileged Access Never
[ 1.285029] CPU features: detected: RAS Extension Support
[ 1.285044] CPU features: detected: Speculative Store Bypassing Safe (SSBS)
[ 1.296826] CPU: All CPU(s) started at EL1
[ 1.296891] alternatives: applying system-wide alternatives
[ 1.315641] devtmpfs: initialized
[ 1.670123] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 1.670327] futex hash table entries: 2048 (order: 6, 393216 bytes, linear)
[ 1.673537] pinctrl core: initialized pinctrl subsystem
[ 1.684021] DMI not present or invalid.
[ 1.687165] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[ 1.695595] DMA: preallocated 1024 KiB GFP_KERNEL pool for atomic allocations
[ 1.696373] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[ 1.698379] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[ 1.698742] audit: initializing netlink subsys (disabled)
[ 1.699628] audit: type=2000 audit(1.628:1): state=initialized audit_enabled=0 res=1
[ 1.708543] thermal_sys: Registered thermal governor 'step_wise'
[ 1.708570] thermal_sys: Registered thermal governor 'power_allocator'
[ 1.708961] cpuidle: using governor ladder
[ 1.709035] cpuidle: using governor menu
[ 1.709849] NET: Registered PF_QIPCRTR protocol family
[ 1.711345] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[ 1.719297] ASID allocator initialised with 32768 entries
[ 1.733197] Serial: AMBA PL011 UART driver
[ 1.855339] platform 1d87000.phy: Fixing up cyclic dependency with 1d84000.ufshc
[ 2.041279] KASLR enabled
[ 2.174612] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
[ 2.174626] HugeTLB: 16380 KiB vmemmap can be freed for a 1.00 GiB page
[ 2.174634] HugeTLB: registered 32.0 MiB page size, pre-allocated 0 pages
[ 2.174640] HugeTLB: 508 KiB vmemmap can be freed for a 32.0 MiB page
[ 2.174647] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[ 2.174652] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page
[ 2.174660] HugeTLB: registered 64.0 KiB page size, pre-allocated 0 pages
[ 2.174665] HugeTLB: 0 KiB vmemmap can be freed for a 64.0 KiB page
[ 2.188583] ACPI: Interpreter disabled.
[ 2.306645] iommu: Default domain type: Translated
[ 2.306660] iommu: DMA domain TLB invalidation policy: strict mode
[ 2.310343] SCSI subsystem initialized
[ 2.313611] usbcore: registered new interface driver usbfs
[ 2.313987] usbcore: registered new interface driver hub
[ 2.314314] usbcore: registered new device driver usb
[ 2.321138] pps_core: LinuxPPS API ver. 1 registered
[ 2.321144] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@xxxxxxxx>
[ 2.321246] PTP clock support registered
[ 2.321452] EDAC MC: Ver: 3.0.0
[ 2.324853] CPUidle PSCI: Initialized CPU PM domain topology
[ 2.327564] qcom_scm: convention: smc arm 64
[ 2.335862] FPGA manager framework