Re: [PATCH next V1 3/7] PCI: Calculate BW limitation for PCI devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 27, 2018 at 02:17:28PM +0200, Tal Gilboa wrote:
> On 2/22/2018 10:21 PM, Bjorn Helgaas wrote:
> > On Mon, Jan 15, 2018 at 05:35:37PM +0200, Tal Gilboa wrote:
> > > pcie_get_link_limits() function, which is based on
> > > pcie_get_minimum_link(), iterates over the PCI chain
> > > and calculates maximum BW in addition to minimum speed and
> > > width. The BW calculation at each level is speed * width, so
> > > even if, for instance, a level has lower width, than the device max
> > > capabilities, it still might not cause a BW limitation if it has a
> > > higher speed. pcie_get_minimum_link() is kept for compatibility.
> > > 
> > > Signed-off-by: Tal Gilboa <talgi@xxxxxxxxxxxx>
> > > Reviewed-by: Tariq Toukan <tariqt@xxxxxxxxxxxx>
> > > ---
> > >   drivers/pci/pci.c   | 31 ++++++++++++++++++++++++++++---
> > >   include/linux/pci.h |  8 ++++++++
> > >   2 files changed, 36 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index cf0401d..f0c60dc 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -5012,16 +5012,37 @@ int pcie_set_mps(struct pci_dev *dev, int mps)
> > >    * @speed: storage for minimum speed
> > >    * @width: storage for minimum width
> > >    *
> > > - * This function will walk up the PCI device chain and determine the minimum
> > > - * link width and speed of the device.
> > > + * This function use pcie_get_link_limits() for determining the minimum
> > > + * link width and speed of the device. Legacy code is kept for compatibility.
> > >    */
> > >   int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
> > >   			  enum pcie_link_width *width)
> > >   {
> > > +	int bw;
> > > +
> > > +	return pcie_get_link_limits(dev, speed, width, &bw);
> > > +}
> > > +EXPORT_SYMBOL(pcie_get_minimum_link);
> > > +
> > > +/**
> > > + * pcie_get_link_limits - determine minimum link settings of a PCI
> > > +			  device and its BW limitation
> > > + * @dev: PCI device to query
> > > + * @speed: storage for minimum speed
> > > + * @width: storage for minimum width
> > > + * @bw: storage for BW limitation
> > > + *
> > > + * This function walks up the PCI device chain and determines the link
> > > + * limitations - minimum width, minimum speed and max possible BW of the device.
> > 
> > I don't really like this interface because it mixes two separate
> > things: (1) the speed/width capabilities of the current device, and
> > (2) the minimum speed/width of the path leading to the device (and
> > these minimums might even be at different points in the path).
> > 
> > I think what we want is a way to get the max bandwidth a device can
> > use, i.e., pcie_get_speed_cap() * pcie_get_width_cap().
> > 
> > If we also had a way to find the minimum bandwidth point leading to a
> > device, e.g., a pcie_get_minimum_bandwidth() that returned a pci_dev
> > and its bandwidth (and maybe the link speed and width corresponding to
> > that device), then you could print a meaningful message like "this
> > device is capable of X, but upstream device Y limits us to Z".
> > 
> > I think the current pcie_print_link_status() is a little misleading
> > because it prints the minimum link speed and width from the path, but
> > those might be from different devices, which doesn't really make
> > sense.
> 
> I see. We started out with simply checking speed and width only for the
> device. If the chain bandwidth check is misleading I'll revert back to the
> original behavior. No need to calculate bandwidth IMO. Checking current
> speed and width and comparing to max capabilities should be enough.

Let's think about where we want to end up.  I think you want to emit a
message if either the current link width or speed is lower than the
device is capable of.  You don't need the bandwidth for that.

Do you also want to emit a message if there's a bottleneck upstream
from the device?  E.g., the link to the device may be x8, 5GT/s, which
is capable of a bandwidth of 40Gb/s, but if there's an upstream link
that is x4, 5GT/s, the device will be limited to a bandwidth of
20Gb/s.  If you want to check for that, I think we do need to figure
the bandwidth.  If we want a message in situations like this, but we
didn't figure the bandwidth, we would erroneously complain if the
upstream link were x16, 2.5GT/s, where that link has a lower speed but
is wider so it's still capable of 40Gb/s.

If you want a message about upstream bottlenecks, do you want to
identify the specific device that is the bottleneck?

IIRC, you're adding these:

  int pcie_get_width_cap(struct pci_dev *dev, enum pcie_link_width *width)
  int pcie_get_speed_cap(struct pci_dev *dev, enum pci_bus_speed *speed)
  int pcie_get_link_limits(struct pci_dev *dev, enum pci_bus_speed *speed,
			   enum pcie_link_width *width, int *bw);

It seems like we might not need all three of these, since the last one
looks like it could subsume the first two.

> > > +int pcie_get_link_limits(struct pci_dev *dev, enum pci_bus_speed *speed,
> > > +			 enum pcie_link_width *width, int *bw)
> > > +{
> > >   	int ret;
> > >   	*speed = PCI_SPEED_UNKNOWN;
> > >   	*width = PCIE_LNK_WIDTH_UNKNOWN;
> > > +	*bw = 0;
> > >   	while (dev) {
> > >   		u16 lnksta;
> > > @@ -5042,12 +5063,16 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
> > >   		if (next_width < *width)
> > >   			*width = next_width;
> > > +		/* Check if current level in the chain limits the total BW */
> > > +		if (!(*bw) || *bw > (next_width) * PCIE_SPEED2MBS(next_speed))
> > > +			*bw = (next_width) * PCIE_SPEED2MBS(next_speed);
> > > +
> > >   		dev = dev->bus->self;
> > >   	}
> > >   	return 0;
> > >   }
> > > -EXPORT_SYMBOL(pcie_get_minimum_link);
> > > +EXPORT_SYMBOL(pcie_get_link_limits);
> > >   /**
> > >    * pcie_get_speed_cap - queries for the PCI device's link speed capability
> > > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > > index 27dc070..88e23eb 100644
> > > --- a/include/linux/pci.h
> > > +++ b/include/linux/pci.h
> > > @@ -265,6 +265,12 @@ enum pci_bus_speed {
> > >   	 (speed) == PCIE_SPEED_2_5GT ? "2.5 GT/s" : \
> > >   	 "Unknown speed")
> > > +#define PCIE_SPEED2MBS(speed) \
> > > +	((speed) == PCIE_SPEED_8_0GT ? 8000 : \
> > > +	 (speed) == PCIE_SPEED_5_0GT ? 5000 : \
> > > +	 (speed) == PCIE_SPEED_2_5GT ? 2500 : \
> > > +	 0)
> > > +
> > >   struct pci_cap_saved_data {
> > >   	u16		cap_nr;
> > >   	bool		cap_extended;
> > > @@ -1088,6 +1094,8 @@ int pcie_get_minimum_link(struct pci_dev *dev, enum pci_bus_speed *speed,
> > >   			  enum pcie_link_width *width);
> > >   int pcie_get_speed_cap(struct pci_dev *dev, enum pci_bus_speed *speed);
> > >   int pcie_get_width_cap(struct pci_dev *dev, enum pcie_link_width *width);
> > > +int pcie_get_link_limits(struct pci_dev *dev, enum pci_bus_speed *speed,
> > > +			 enum pcie_link_width *width, int *bw);
> > >   void pcie_flr(struct pci_dev *dev);
> > >   int __pci_reset_function_locked(struct pci_dev *dev);
> > >   int pci_reset_function(struct pci_dev *dev);
> > > -- 
> > > 1.8.3.1
> > > 



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux