On Fri, 18 Sep 2009 08:38:02 -0700 (PDT) Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > > On Fri, 18 Sep 2009, Ingo Molnar wrote: > > > > [ 158.058140] warning: `dbus-daemon' uses 32-bit capabilities > > (legacy support in use) [ 159.370562] BUG: unable to handle kernel > > NULL pointer dereference at (null) [ 159.372694] IP: > > [<ffffffff8143b722>] bitmap_scnprintf+0x72/0xd0 > > Hmm. The code is > > a: 49 63 fc movslq %r12d,%rdi > d: 0f 49 d3 cmovns %ebx,%edx > 10: c1 f8 1f sar $0x1f,%eax > 13: 4c 01 ff add %r15,%rdi > 16: c1 e8 1a shr $0x1a,%eax > 19: c1 fa 06 sar $0x6,%edx > 1c: 41 c1 e8 02 shr $0x2,%r8d > 20: 8d 0c 03 lea (%rbx,%rax,1),%ecx > 23: 48 63 d2 movslq %edx,%rdx > 26: 83 e1 3f and $0x3f,%ecx > 29: 29 c1 sub %eax,%ecx > 2b:* 49 8b 44 d5 00 mov > 0x0(%r13,%rdx,8),%rax <-- trapping instruction 30: 48 c7 > c2 8c 37 16 82 mov $0xffffffff8216378c,%rdx 37: 48 > d3 e8 shr %cl,%rax 3a: 89 > f1 mov %esi,%ecx 3c: 44 89 > f6 mov %r14d,%esi > > and the obvious reason seems to be that 'maskp' is NULL (that > faulting thing is the code for "val = (maskp[word] >> bit) & > chunkmask;" with the actual fault being the access of "maskp[word]". > > Now, the caller does > > mask = cpumask_of_pcibus(to_pci_dev(dev)->bus); > > and then uses cpumask_scnprintf() that is just a wrapper that does > > bitmap_scnprintf(buf, len, cpumask_bits(srcp), > nr_cpumask_bits); > > So clearly we have "cpumask_of_pcibus()" being NULL (cpumask_bits() > would not change it). > > I assume this is the NUMA case? The non-NUMA case has just > > static inline const struct cpumask *cpumask_of_node(int node) > { > return cpu_online_mask; > } > > so I don't think you can ever get NULL (if we have a NULL > cpu_online_mask we have bigger problems). > > [ Side note: looking closer, I think our headers are buggy, and I > _know_ they are confusing. The above inline declaration of > cpumask_of_node() seems to be then later overridden in > <asm-generic/topology.h> by a #define! > > And if I read that right, that will also override the debugging > versions that we declared if CONFIG_DEBUG_PER_CPU_MAPS is on. Ingo? > Rusty? Am I missing something? > > That said, those overrides should only happen for non-NUMA ] > > The NUMA version of 'cpumask_of_node()' has all the debug code for > show it's not returning NULL, but only when CONFIG_DEBUG_PER_CPU_MAPS > is enabled. Otherwise it all seems to boil down to (through > cpumask_of_pcibus and __pcibus_to_node): > > node_to_cpumask_map[bus->sysdata->node] > > and it can fail either because "node" isn't initialized, or > node_to_cpumask_map[] isn't. > > Probably 'node' is still -1, and it gets the NULL by going off the > array into la-la-land. Yeah David posted a fix for this diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h --- a/arch/x86/include/asm/pci.h +++ b/arch/x86/include/asm/pci.h @@ -143,7 +143,11 @@ static inline int __pcibus_to_node(const struct pci_bus *bus) static inline const struct cpumask * cpumask_of_pcibus(const struct pci_bus *bus) { - return cpumask_of_node(__pcibus_to_node(bus)); + int node; + + node = __pcibus_to_node(bus); + return (node == -1) ? cpu_online_mask : + cpumask_of_node(node); } #endif Looks like it should fix this issue. He's also right that we should probably have a NUMA_NO_NODE define for this case... I'll pick it up and put it in my tree. Jesse -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html