Re: Linux warns `Unknown NUMA node; performance will be reduced`

Dear Bjorn,


On 10.06.24 at 21:42, Bjorn Helgaas wrote:
[+cc Yunsheng, thread at
https://lore.kernel.org/r/a154f694-c48b-4b3b-809f-4b74ec86a924@xxxxxxxxxxxxx]

Thanks very much for this report!

Thank you for the quick reply.

On Sun, Jun 09, 2024 at 10:31:05AM +0200, Paul Menzel wrote:
On the servers below Linux warns:

      Unknown NUMA node; performance will be reduced

This warning was added by ad5086108b9f ("PCI: Warn if no host bridge
NUMA node info"), which appeared in v5.5, so I assume this isn't new.

That commit log says:

   In pci_call_probe(), we try to run driver probe functions on the node where
   the device is attached.  If we don't know which node the device is attached
   to, the driver will likely run on the wrong node.  This will still work,
   but performance will not be as good as it could be.

   On NUMA systems, warn if we don't know which node a PCI host bridge is
   attached to.  This is likely an indication that ACPI didn't supply a _PXM
   method or the DT didn't supply a "numa-node-id" property.

I assume these are all ACPI systems, so likely missing _PXM.  An
acpidump could confirm this.
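As a quick cross-check before digging through an acpidump, the node the kernel assigned to each PCI device can be read from the `numa_node` attribute in sysfs, where -1 means "unknown". The following is only a minimal sketch: the function name `devices_without_numa_node` and its `sysfs_root` parameter are made up for illustration, and on a kernel or machine without NUMA every device may report -1 or lack the attribute entirely.

```python
from pathlib import Path

NUMA_NO_NODE = -1  # value sysfs reports when the kernel has no node for a device


def devices_without_numa_node(sysfs_root="/sys/bus/pci/devices"):
    """Return the sorted PCI addresses whose numa_node attribute is -1.

    sysfs_root is a parameter only so the function can be pointed at a
    copy of the tree; the default is the usual sysfs location.
    """
    unknown = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return unknown
    for dev in root.iterdir():
        try:
            node = int((dev / "numa_node").read_text().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable; skip rather than guess
        if node == NUMA_NO_NODE:
            unknown.append(dev.name)
    return sorted(unknown)
```

Calling `devices_without_numa_node()` on an affected machine should list the devices behind the host bridges that lack node information, which can then be compared against the `_PXM` methods in the DSDT.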

I created an issue in the Linux Kernel Bugzilla [1] and attached the output of `acpidump` on a Dell PowerEdge T630 there. The DSDT contains:

        Device (PCI1)
        {
        […]
            Method (_PXM, 0, NotSerialized)  // _PXM: Device Proximity
            {
                If ((CLOD == 0x00))
                {
                    Return (0x01)
                }
                Else
                {
                    Return (0x02)
                }
            }
        […]
        }

I think the devices on buses 7f and ff are Intel chipset devices, and
I doubt we have drivers for any of them.  They have vendor/device IDs
of 8086:6fXX, and I didn't see any reference to them:

   $ git grep -i \<0x6f..\>
   $

Interesting. Any idea what these chipset devices do?

If we *did* have drivers, they would certainly benefit from having
_PXM, but since there are no probe methods, I don't think it matters
that we don't know where they should run.

Maybe the message should be downgraded from "dev_warn" to "dev_info"
since there's no functional problem, and the user can't really do
anything about it.

We could also consider moving it to the actual probe path, so we don't
emit a message unless there is an affected driver.

Both ideas sound good, but I do not know the code at all.

1.  [    0.000000] DMI: Dell Inc. PowerEdge R730/0H21J3, BIOS 2.13.0 05/14/2021
2.  [    0.000000] DMI: Dell Inc. PowerEdge R730/0H21J3, BIOS 2.2.5 09/06/2016
3.  [    0.000000] DMI: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS 2.3.4 11/08/2016
4.  [    0.000000] DMI: Dell Inc. PowerEdge R910/0KYD3D, BIOS 2.10.0 08/29/2013
5.  [    0.000000] DMI: Dell Inc. PowerEdge R930/0T55KM, BIOS 2.8.1 01/02/2020
6.  [    0.000000] DMI: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.5.4 08/17/2017
7.  [    0.000000] DMI: Dell Inc. PowerEdge T630/0W9WXC, BIOS 1.5.4 10/04/2015
8.  [    0.000000] DMI: Dell Inc. PowerEdge T630/0W9WXC, BIOS 2.11.0 12/23/2019
9.  [    0.000000] DMI: Dell Inc. PowerEdge T630/0W9WXC, BIOS 2.1.5 04/13/2016
10. [    0.000000] DMI: Supermicro Super Server/X13SAE, BIOS 2.0 10/17/2022
...

7f:08.0 System peripheral [0880]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f80] (rev 01)
7f:08.2 Performance counters [1101]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f32] (rev 01)
...

ff:08.0 System peripheral [0880]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f80] (rev 01)
ff:08.2 Performance counters [1101]: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D QPI Link 0 [8086:6f32] (rev 01)
...
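To see at a glance which vendor/device IDs sit on which bus, listings like the one above can be grouped with a few lines of Python. This is a rough sketch that assumes the `lspci -nn` line format shown here (no PCI domain prefix); the function name `ids_by_bus` and its parameters are hypothetical.

```python
import re
from collections import defaultdict

# Example line:
# "7f:08.0 System peripheral [0880]: Intel Corporation ... [8086:6f80] (rev 01)"
LSPCI_RE = re.compile(
    r"^(?P<bus>[0-9a-f]{2}):(?P<slot>[0-9a-f]{2}\.[0-7]) .*"
    r"\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\]"
)


def ids_by_bus(lspci_lines, vendor="8086", device_prefix="6f"):
    """Map bus number -> sorted vendor:device IDs matching vendor and prefix."""
    found = defaultdict(set)
    for line in lspci_lines:
        m = LSPCI_RE.match(line.strip())
        if m and m["vendor"] == vendor and m["device"].startswith(device_prefix):
            found[m["bus"]].add(f"{m['vendor']}:{m['device']}")
    # Lines that do not look like lspci entries (for example "...") are ignored.
    return {bus: sorted(ids) for bus, ids in found.items()}
```

The resulting IDs could then be searched for in the kernel tree one by one, to check whether any driver claims them.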


[    0.000000] DMI: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.4.2 01/09/2017
...
[    4.398627] ACPI: PCI Root Bridge [UNC1] (domain 0000 [bus ff])
[    4.437865] pci_bus 0000:ff: Unknown NUMA node; performance will be reduced
...
[    4.901021] ACPI: PCI Root Bridge [UNC0] (domain 0000 [bus 7f])
[    4.940865] pci_bus 0000:7f: Unknown NUMA node; performance will be reduced
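For checking many machines, the affected buses can be pulled out of saved kernel logs with a small script. The sketch below assumes the message format shown in the excerpt above; the function name `buses_with_unknown_node` is hypothetical, and the input is any iterable of log lines (for example the saved output of `dmesg`).

```python
import re

# Matches e.g. "pci_bus 0000:ff: Unknown NUMA node; performance will be reduced"
WARNING_RE = re.compile(
    r"pci_bus (?P<domain>[0-9a-fA-F]{4}):(?P<bus>[0-9a-fA-F]{2}): Unknown NUMA node"
)


def buses_with_unknown_node(log_lines):
    """Return "domain:bus" strings, in log order, for each warning found."""
    buses = []
    for line in log_lines:
        m = WARNING_RE.search(line)
        if m:
            buses.append(f"{m['domain']}:{m['bus']}")
    return buses
```

Fed the excerpt above, this would report buses `0000:ff` and `0000:7f`, the two `UNC` root bridges.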


Kind regards,

Paul


[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218951



