On 2024/6/11 4:27, Paul Menzel wrote:
> Dear Bjorn,
>
>
> On 10.06.24 at 21:42, Bjorn Helgaas wrote:
>> [+cc Yunsheng, thread at
>> https://lore.kernel.org/r/a154f694-c48b-4b3b-809f-4b74ec86a924@xxxxxxxxxxxxx]

Thanks for cc'ing.

>>
>> Thanks very much for this report!
>
> Thank you for the quick reply.
>
>> On Sun, Jun 09, 2024 at 10:31:05AM +0200, Paul Menzel wrote:
>>> On the servers below Linux warns:
>>>
>>>     Unknown NUMA node; performance will be reduced
>>
>> This warning was added by ad5086108b9f ("PCI: Warn if no host bridge
>> NUMA node info"), which appeared in v5.5, so I assume this isn't new.
>>
>> That commit log says:
>>
>>   In pci_call_probe(), we try to run driver probe functions on the node where
>>   the device is attached. If we don't know which node the device is attached
>>   to, the driver will likely run on the wrong node. This will still work,
>>   but performance will not be as good as it could be.
>>
>>   On NUMA systems, warn if we don't know which node a PCI host bridge is
>>   attached to. This is likely an indication that ACPI didn't supply a _PXM
>>   method or the DT didn't supply a "numa-node-id" property.
>>
>> I assume these are all ACPI systems, so likely missing _PXM. An
>> acpidump could confirm this.
>
> I created an issue in the Linux Kernel Bugzilla [1] and attached the output of `acpidump` on a Dell PowerEdge T630 there. The DSDT contains:
>
>     Device (PCI1)
>     {
>         […]
>         Method (_PXM, 0, NotSerialized) // _PXM: Device Proximity
>         {
>             If ((CLOD == 0x00))
>             {
>                 Return (0x01)
>             }
>             Else
>             {
>                 Return (0x02)
>             }
>         }
>         […]
>     }
>
>> I think the devices on buses 7f and ff are Intel chipset devices, and
>> I doubt we have drivers for any of them. They have vendor/device IDs
>> of 8086:6fXX, and I didn't see any reference to them:
>>
>>   $ git grep -i \<0x6f..\>
>>   $
>
> Interesting. Any ideas, what these chipset devices do?
>
>> If we *did* have drivers, they would certainly benefit from having
>> _PXM, but since there are no probe methods, I don't think it matters
>> that we don't know where they should run.
>>
>> Maybe the message should be downgraded from "dev_warn" to "dev_info"
>> since there's no functional problem, and the user can't really do
>> anything about it.
>>
>> We could also consider moving it to the actual probe path, so we don't
>> emit a message unless there is an affected driver.

The problem seems to be how we decide whether there is an affected driver.
Do we care about out-of-tree drivers? Doesn't an out-of-tree driver suffer
from a similar problem if the BIOS does not provide the correct NUMA info?

The 'Unknown NUMA node; performance will be reduced' warning seems to have
been added to put some pressure on the vendor to fix the BIOS as fast as
possible. Downgrading it from "dev_warn" to "dev_info" or moving it to the
actual probe path does not seem to fix the problem; doesn't it just
alleviate the pressure on the vendor to fix the BIOS?

>
> Both ideas sound good, but I do not know the code at all.
>
>>> 1. [ 0.000000] DMI: Dell Inc. PowerEdge R730/0H21J3, BIOS 2.13.0 05/14/2021
>>> 2. [ 0.000000] DMI: Dell Inc. PowerEdge R730/0H21J3, BIOS 2.2.5 09/06/2016
>>> 3. [ 0.000000] DMI: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS 2.3.4 11/08/2016
>>> 4. [ 0.000000] DMI: Dell Inc. PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013
>>> 5. [ 0.000000] DMI: Dell Inc. PowerEdge R930/0T55KM, BIOS 2.8.1 01/02/2020
>>> 6. [ 0.000000] DMI: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.5.4 08/17/2017
>>> 7. [ 0.000000] DMI: Dell Inc. PowerEdge T630/0W9WXC, BIOS 1.5.4 10/04/2015
>>> 8. [ 0.000000] DMI: Dell Inc. PowerEdge T630/0W9WXC, BIOS 2.11.0 12/23/2019
>>> 9. [ 0.000000] DMI: Dell Inc. PowerEdge T630/0W9WXC, BIOS 2.1.5 04/13/2016
>>> 10. [ 0.000000] DMI: Supermicro Super Server/X13SAE, BIOS 2.0 10/17/2022
>>> ...
>>
>>> 7f:08.0 System peripheral [0880]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f80] (rev 01)
>>> 7f:08.2 Performance counters [1101]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f32] (rev 01)
>>> ...
>>
>>> ff:08.0 System peripheral [0880]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f80] (rev 01)
>>> ff:08.2 Performance counters [1101]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f32] (rev 01)
>>> ...
>>
>>
>>> [ 0.000000] DMI: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.4.2 01/09/2017
>>> ...
>>> [ 4.398627] ACPI: PCI Root Bridge [UNC1] (domain 0000 [bus ff])
>>> [ 4.437865] pci_bus 0000:ff: Unknown NUMA node; performance will be reduced
>>> ...
>>> [ 4.901021] ACPI: PCI Root Bridge [UNC0] (domain 0000 [bus 7f])
>>> [ 4.940865] pci_bus 0000:7f: Unknown NUMA node; performance will be reduced
>
>
> Kind regards,
>
> Paul
>
>
> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=218951
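
As a side note, to see from userspace which PCI devices end up without a NUMA node, something like the following might help. It is only a minimal sketch: it assumes the per-device numa_node attribute under /sys/bus/pci/devices, which reads -1 when the node is unknown, and the function name is made up for illustration. I would expect the devices on buses 7f and ff to show up here, but I have not checked that on the machines listed above.

```python
from pathlib import Path


def pci_devices_without_numa_node(sysfs_root="/sys/bus/pci/devices"):
    """Return sorted PCI addresses whose numa_node attribute reads negative."""
    unknown = []
    root = Path(sysfs_root)
    if not root.is_dir():
        # No PCI sysfs tree at this path; nothing to report.
        return unknown
    for dev in sorted(root.iterdir()):
        try:
            node = int((dev / "numa_node").read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable for this device; skip it.
            continue
        if node < 0:
            unknown.append(dev.name)
    return unknown


if __name__ == "__main__":
    for bdf in pci_devices_without_numa_node():
        print(f"{bdf}: numa_node unknown (-1)")
```

This only reports what the kernel already knows; it does not tell us whether a driver is bound to the device, so it does not answer the question of which drivers are affected.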