RE: [Patch v2] PCI: hv: Fix NUMA node assignment when kernel boots with custom NUMA topology

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Subject: RE: [Patch v2] PCI: hv: Fix NUMA node assignment when kernel boots
> with custom NUMA topology
> 
> From: longli@xxxxxxxxxxxxxxxxx <longli@xxxxxxxxxxxxxxxxx> Sent: Tuesday,
> January 18, 2022 6:45 PM
> >
> > When kernel boots with a NUMA topology with some NUMA nodes offline,
> > the PCI driver should only set an online NUMA node on the device. This
> > can happen during KDUMP where some NUMA nodes are not made online by
> the KDUMP kernel.
> >
> > This patch also fixes the case where kernel is booting with "numa=off".
> >
> > Signed-off-by: Long Li <longli@xxxxxxxxxxxxx>
> 
> It seems like adding a "Fixes:" tag would be appropriate.

Okay, sending v3 with "Fixes:".

> 
> >
> > Change from v1:
> > Use numa_map_to_online_node() to assign a node to device (suggested by
> > Michael Kelly <mikelley@xxxxxxxxxxxxx>)
> >
> > ---
> >  drivers/pci/controller/pci-hyperv.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/controller/pci-hyperv.c
> > b/drivers/pci/controller/pci-hyperv.c
> > index 6c9efeefae1b..c7519add6f13 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -2130,7 +2130,15 @@ static void hv_pci_assign_numa_node(struct
> > hv_pcibus_device *hbus)
> >  			continue;
> >
> >  		if (hv_dev->desc.flags &
> HV_PCI_DEVICE_FLAG_NUMA_AFFINITY)
> > -			set_dev_node(&dev->dev, hv_dev-
> >desc.virtual_numa_node);
> > +			/*
> > +			 * The kernel may boot with some NUMA nodes offline
> > +			 * (e.g. in a KDUMP kernel) or with NUMA disabled via
> > +			 * "numa=off". In those cases, adjust the host provided
> > +			 * NUMA node to a valid NUMA node used by the kernel.
> > +			 */
> > +			set_dev_node(&dev->dev,
> > +				     numa_map_to_online_node(
> > +					     hv_dev->desc.virtual_numa_node));
> 
> Double-check me, but I think this approach has a flaw in that
> numa_map_to_online_node() doesn't check the input node for being out-of-
> range.  The call to node_online() uses the input node to index into the
> node_states bitmap (of type nodemask_t), and could go off the end of the
> bitmap if the input node isn't validated to be less than MAX_NUMNODES (or
> nr_node_ids).

Indeed, your concern is correct when "numa=off" is set.

> 
> Michael
> 
> >
> >  		put_pcichild(hv_dev);
> >  	}
> > --
> > 2.25.1





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux