Previously without the patch, there was an error observed as follows: crash> irq -a IRQ NAME AFFINITY 0 timer 0-191 4 ttyS0 0-23,96-119 ... 84 smartpqi 72-73,168 irq: page excluded: kernel virtual address: ffff97d03ffff000 type: "irq_desc affinity" The reason is the reading of irq affinity exceeded the memory range, see the following debug info: Thread 1 "crash" hit Breakpoint 1, generic_get_irq_affinity (irq=85) at kernel.c:7373 7375 irq_desc_addr = get_irq_desc_addr(irq); (gdb) p/x irq_desc_addr $1 = 0xffff97d03f21e800 crash> struct irq_desc 0xffff97d03f21e800 struct irq_desc { irq_common_data = { state_use_accessors = 425755136, node = 3, handler_data = 0x0, msi_desc = 0xffff97ca51b83480, affinity = 0xffff97d03fffee60, effective_affinity = 0xffff97d03fffe6c0 }, crash> whatis cpumask_t typedef struct cpumask { unsigned long bits[128]; } cpumask_t; SIZE: 1024 In order to get the affinity, crash will read the memory range 0xffff97d03fffee60 ~ 0xffff97d03fffee60 + 1024(0x400) by line: readmem(affinity_ptr, KVADDR, affinity, len, "irq_desc affinity", FAULT_ON_ERROR); However the reading will exceed the effective memory range: crash> kmem 0xffff97d03fffee60 CACHE OBJSIZE ALLOCATED TOTAL SLABS SSIZE NAME ffff97c900044400 32 123297 162944 1273 4k kmalloc-32 SLAB MEMORY NODE TOTAL ALLOCATED FREE fffffca460ffff80 ffff97d03fffe000 3 128 81 47 FREE / [ALLOCATED] [ffff97d03fffee60] PAGE PHYSICAL MAPPING INDEX CNT FLAGS fffffca460ffff80 83fffe000 dead000000000001 ffff97d03fffe340 1 d7ffffe0000800 slab crash> kmem ffff97d03ffff000 PAGE PHYSICAL MAPPING INDEX CNT FLAGS fffffca460ffffc0 83ffff000 0 0 1 d7ffffe0004000 reserved crash> dmesg ... [ 0.000000] BIOS-e820: [mem 0x00000000fe000000-0x00000000fe00ffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000083fffefff] usable [ 0.000000] BIOS-e820: [mem 0x000000083ffff000-0x000000083fffffff] reserved ... The beginning physical address, aka 0x83fffe000, is located in the usable area and is readable, however the later physical address, starting from 0x83ffff000, is located in reserved region and not readable. In fact, the affinity member is allocated by alloc_cpumask_var_node(), for the 192 CPUs system, the allocated size is only 24, and we can see it is within the kmalloc-32 slab. So it is incorrect to read 1024 length(given by STRUCT_SIZE("cpumask_t")), only 24 is enough. Since there are plenty of places in crash which takes the value of STRUCT_SIZE("cpumask_t"), and works fine for the past, this patch will not modify them all, but only this place which encountered the issue. Signed-off-by: Tao Liu <ltao@xxxxxxxxxx> --- kernel.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/kernel.c b/kernel.c index 8a9d498..464e877 100644 --- a/kernel.c +++ b/kernel.c @@ -7362,7 +7362,7 @@ void generic_get_irq_affinity(int irq) { ulong irq_desc_addr; - long len; + long len, len_cpumask; ulong affinity_ptr; ulong *affinity; ulong tmp_addr; @@ -7382,8 +7382,11 @@ generic_get_irq_affinity(int irq) if (!action) return; - if ((len = STRUCT_SIZE("cpumask_t")) < 0) - len = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) * sizeof(ulong); + len = DIV_ROUND_UP(kt->cpus, BITS_PER_LONG) * sizeof(ulong); + len_cpumask = STRUCT_SIZE("cpumask_t"); + if (len_cpumask > 0) { + len = len_cpumask > len ? len : len_cpumask; + } affinity = (ulong *)GETBUF(len); if (VALID_MEMBER(irq_common_data_affinity)) -- 2.40.1 -- Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki