Re: [PATCH v2] pci: Store more data about VFs into the SRIOV struct

On Thu, 2018-03-01 at 13:34 -0600, Bjorn Helgaas wrote:
> s|pci: Store|PCI/IOV: Store|
> 
> (run "git log --oneline drivers/pci/probe.c" to see why)
> 
> On Thu, Mar 01, 2018 at 02:26:04PM +0100, KarimAllah Ahmed wrote:
> > 
> > ... to avoid reading them from the config space of all the PCI VFs. This is
> > an especially useful optimization when bringing up thousands of VFs.
> 
> Please make the changelog complete in itself, so it doesn't have to be
> read in conjunction with the subject.  It's OK if you have to repeat
> the subject in the changelog.

ack.

> 
> > 
> > Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > Cc: linux-pci@xxxxxxxxxxxxxxx
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > Signed-off-by: KarimAllah Ahmed <karahmed@xxxxxxxxx>
> > ---
> > v1 -> v2:
> > * Rebase on latest + remove dependency on a non-upstream patch.
> > 
> >  drivers/pci/iov.c   | 16 ++++++++++++++++
> >  drivers/pci/pci.h   |  5 +++++
> >  drivers/pci/probe.c | 42 ++++++++++++++++++++++++++++++++----------
> >  3 files changed, 53 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> > index 677924a..e1d2e3f 100644
> > --- a/drivers/pci/iov.c
> > +++ b/drivers/pci/iov.c
> > @@ -114,6 +114,19 @@ resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> >  	return dev->sriov->barsz[resno - PCI_IOV_RESOURCES];
> >  }
> >  
> > +static void pci_read_vf_config_common(struct pci_bus *bus, struct pci_dev *dev)
> > +{
> > +	int devfn = pci_iov_virtfn_devfn(dev, 0);
> > +
> > +	pci_bus_read_config_dword(bus, devfn, PCI_CLASS_REVISION,
> > +				  &dev->sriov->class);
> > +	pci_bus_read_config_word(bus, devfn, PCI_SUBSYSTEM_ID,
> > +				 &dev->sriov->subsystem_device);
> > +	pci_bus_read_config_word(bus, devfn, PCI_SUBSYSTEM_VENDOR_ID,
> > +				 &dev->sriov->subsystem_vendor);
> > +	pci_bus_read_config_byte(bus, devfn, PCI_HEADER_TYPE, &dev->sriov->hdr_type);
> 
> Can't you do this a little later, e.g., after pci_iov_add_virtfn()
> calls pci_setup_device(), and then use the standard
> pci_read_config_*() interfaces instead of the special
> pci_bus_read_config*() ones?

ack.

I moved it after "pci_iov_virtfn_devfn".
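
To illustrate the idea behind the patch (read the common fields once from VF 0,
then reuse them for every later VF), here is a minimal user-space sketch. It is
not the kernel code: mock_config_read(), struct vf_cache, setup_vf() and
bring_up_vfs() are hypothetical names, and the values returned by the mock are
made up. Only the four register offsets follow the standard PCI header layout.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI config header offsets (same values as the kernel's PCI_* macros). */
#define CFG_CLASS_REVISION   0x08
#define CFG_HEADER_TYPE      0x0e
#define CFG_SUBSYS_VENDOR_ID 0x2c
#define CFG_SUBSYS_ID        0x2e

/* Hypothetical stand-in for the fields the patch adds to struct pci_sriov. */
struct vf_cache {
	int      valid;
	uint32_t class_rev;
	uint16_t subsystem_vendor;
	uint16_t subsystem_device;
	uint8_t  hdr_type;
};

static unsigned long config_reads;	/* number of simulated config accesses */

/* Hypothetical slow config read; the returned values are made up. */
static uint32_t mock_config_read(int vf_id, int offset)
{
	(void)vf_id;
	config_reads++;
	switch (offset) {
	case CFG_CLASS_REVISION:   return 0x02000001;
	case CFG_HEADER_TYPE:      return 0x00;
	case CFG_SUBSYS_VENDOR_ID: return 0x8086;
	case CFG_SUBSYS_ID:        return 0x1234;
	default:                   return 0xffffffff;
	}
}

/* Fill the cache while setting up VF 0, then serve every VF from the cache. */
static void setup_vf(struct vf_cache *c, int id, struct vf_cache *vf)
{
	if (id == 0) {
		c->class_rev        = mock_config_read(0, CFG_CLASS_REVISION);
		c->hdr_type         = (uint8_t)mock_config_read(0, CFG_HEADER_TYPE);
		c->subsystem_vendor = (uint16_t)mock_config_read(0, CFG_SUBSYS_VENDOR_ID);
		c->subsystem_device = (uint16_t)mock_config_read(0, CFG_SUBSYS_ID);
		c->valid = 1;
	}
	assert(c->valid);	/* VF 0 must have been set up first */
	*vf = *c;
}

/* Set up num_vfs VFs in order and return how many config reads were needed. */
unsigned long bring_up_vfs(int num_vfs)
{
	struct vf_cache cache = { 0 };
	struct vf_cache vf;
	int id;

	config_reads = 0;
	for (id = 0; id < num_vfs; id++)
		setup_vf(&cache, id, &vf);
	return config_reads;
}
```

In this model the number of config reads stays at four no matter how many VFs
are brought up, instead of growing by four per VF.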

> 
> > 
> > +}
> > +
> >  int pci_iov_add_virtfn(struct pci_dev *dev, int id)
> >  {
> >  	int i;
> > @@ -133,6 +146,9 @@ int pci_iov_add_virtfn(struct pci_dev *dev, int id)
> >  	if (!virtfn)
> >  		goto failed0;
> >  
> > +	if (id == 0)
> > +		pci_read_vf_config_common(bus, dev);
> > +
> >  	virtfn->devfn = pci_iov_virtfn_devfn(dev, id);
> >  	virtfn->vendor = dev->vendor;
> >  	virtfn->device = iov->vf_device;
> > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > index fcd8191..346daa5 100644
> > --- a/drivers/pci/pci.h
> > +++ b/drivers/pci/pci.h
> > @@ -271,6 +271,11 @@ struct pci_sriov {
> >  	u16		driver_max_VFs;	/* Max num VFs driver supports */
> >  	struct pci_dev	*dev;		/* Lowest numbered PF */
> >  	struct pci_dev	*self;		/* This PF */
> > +	u8 hdr_type;		/* VF header type */
> > +	u32 class;		/* VF class and revision */
> > +	u16 device;		/* VF device */
> > +	u16 subsystem_vendor;	/* VF subsystem vendor */
> > +	u16 subsystem_device;	/* VF subsystem device */
> 
> Please make the whitespace here match the existing code, i.e.,
> line up the structure element names and comments.

ack!

> 
> > 
> >  	resource_size_t	barsz[PCI_SRIOV_NUM_BARS];	/* VF BAR size */
> >  	bool		drivers_autoprobe; /* Auto probing of VFs by driver */
> >  };
> > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> > index ef53774..aeaa10a 100644
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -180,6 +180,7 @@ static inline unsigned long decode_bar(struct pci_dev *dev, u32 bar)
> >  int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
> >  		    struct resource *res, unsigned int pos)
> >  {
> > +	int bar = res - dev->resource;
> >  	u32 l = 0, sz = 0, mask;
> >  	u64 l64, sz64, mask64;
> >  	u16 orig_cmd;
> > @@ -199,9 +200,13 @@ int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
> >  	res->name = pci_name(dev);
> >  
> >  	pci_read_config_dword(dev, pos, &l);
> > -	pci_write_config_dword(dev, pos, l | mask);
> > -	pci_read_config_dword(dev, pos, &sz);
> > -	pci_write_config_dword(dev, pos, l);
> > +	if (dev->is_virtfn) {
> > +		sz = dev->physfn->sriov->barsz[bar] & 0xffffffff;
> > +	} else {
> > +		pci_write_config_dword(dev, pos, l | mask);
> > +		pci_read_config_dword(dev, pos, &sz);
> > +		pci_write_config_dword(dev, pos, l);
> > +	}
> 
> This part is not like the others, i.e., the others are caching info
> from VF 0 in newly-added elements of struct pci_sriov.  This also uses
> information from struct pci_sriov, but it's qualitatively different,
> so it should be in a separate patch.

ack. Moved to a separate patch.
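
For reference, the two hunks in question take the low and the high 32 bits of
the cached 64-bit BAR size, which __pci_read_base() later recombines into sz64.
A minimal stand-alone sketch of just that arithmetic follows. It assumes a
64-bit size type (uint64_t here), and size_lo(), size_hi() and size_join() are
hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Low dword of a 64-bit BAR size, as in "barsz[bar] & 0xffffffff". */
uint32_t size_lo(uint64_t barsz)
{
	return (uint32_t)(barsz & 0xffffffff);
}

/* High dword of a 64-bit BAR size, as in "(barsz[bar] >> 32) & 0xffffffff". */
uint32_t size_hi(uint64_t barsz)
{
	return (uint32_t)((barsz >> 32) & 0xffffffff);
}

/* Recombine the two dwords, as in "sz64 |= ((u64)sz << 32)". */
uint64_t size_join(uint32_t lo, uint32_t hi)
{
	uint64_t joined = (uint64_t)lo | ((uint64_t)hi << 32);

	/* Splitting the result again must give back the same halves. */
	assert(size_lo(joined) == lo && size_hi(joined) == hi);
	return joined;
}
```

The shift by 32 is only well defined when the size type is wider than 32 bits,
which is why the sketch fixes the type to uint64_t.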

> 
> > 
> >  	/*
> >  	 * All bits set in sz means the device isn't working properly.
> > @@ -241,9 +246,14 @@ int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
> >  
> >  	if (res->flags & IORESOURCE_MEM_64) {
> >  		pci_read_config_dword(dev, pos + 4, &l);
> > -		pci_write_config_dword(dev, pos + 4, ~0);
> > -		pci_read_config_dword(dev, pos + 4, &sz);
> > -		pci_write_config_dword(dev, pos + 4, l);
> > +
> > +		if (dev->is_virtfn) {
> > +			sz = (dev->physfn->sriov->barsz[bar] >> 32) & 0xffffffff;
> > +		} else {
> > +			pci_write_config_dword(dev, pos + 4, ~0);
> > +			pci_read_config_dword(dev, pos + 4, &sz);
> > +			pci_write_config_dword(dev, pos + 4, l);
> > +		}
> >  
> >  		l64 |= ((u64)l << 32);
> >  		sz64 |= ((u64)sz << 32);
> > @@ -332,6 +342,8 @@ static void pci_read_bases(struct pci_dev *dev, unsigned int howmany, int rom)
> >  	for (pos = 0; pos < howmany; pos++) {
> >  		struct resource *res = &dev->resource[pos];
> >  		reg = PCI_BASE_ADDRESS_0 + (pos << 2);
> > +		if (dev->is_virtfn && dev->physfn->sriov->barsz[pos] == 0)
> > +			continue;
> >  		pos += __pci_read_base(dev, pci_bar_unknown, res, reg);
> >  	}
> >  
> > @@ -1454,7 +1466,9 @@ int pci_setup_device(struct pci_dev *dev)
> >  	struct pci_bus_region region;
> >  	struct resource *res;
> >  
> > -	if (pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type))
> > +	if (dev->is_virtfn)
> > +		hdr_type = dev->physfn->sriov->hdr_type;
> > +	else if (pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type))
> >  		return -EIO;
> >  
> >  	dev->sysdata = dev->bus->sysdata;
> > @@ -1477,7 +1491,10 @@ int pci_setup_device(struct pci_dev *dev)
> >  		     dev->bus->number, PCI_SLOT(dev->devfn),
> >  		     PCI_FUNC(dev->devfn));
> >  
> > -	pci_read_config_dword(dev, PCI_CLASS_REVISION, &class);
> > +	if (dev->is_virtfn)
> > +		class = dev->physfn->sriov->class;
> > +	else
> > +		pci_read_config_dword(dev, PCI_CLASS_REVISION, &class);
> >  	dev->revision = class & 0xff;
> >  	dev->class = class >> 8;		    /* upper 3 bytes */
> >  
> > @@ -1517,8 +1534,13 @@ int pci_setup_device(struct pci_dev *dev)
> >  			goto bad;
> >  		pci_read_irq(dev);
> >  		pci_read_bases(dev, 6, PCI_ROM_ADDRESS);
> > -		pci_read_config_word(dev, PCI_SUBSYSTEM_VENDOR_ID, &dev->subsystem_vendor);
> > -		pci_read_config_word(dev, PCI_SUBSYSTEM_ID, &dev->subsystem_device);
> > +		if (dev->is_virtfn) {
> > +			dev->subsystem_vendor = dev->physfn->sriov->subsystem_vendor;
> > +			dev->subsystem_device = dev->physfn->sriov->subsystem_device;
> 
> PCIe r4.0, sec 9.3.4.1.13 requires that Subsystem Vendor ID be the
> same for the PF and all VFs, but sec 9.3.4.1.14 says the PF and VF may
> have different Subsystem IDs.  I know you're caching the Subsystem ID
> from VF 0, not the PF, but I don't see anything that requires all the
> VFs to have the same Subsystem ID.
> 
> I think the same is technically true for the Revision ID.  It might be
> reasonable to assume all the VFs have the same values, but maybe worth
> a comment.

I added a comment about that for the 3 fields.
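
As an illustration of what that assumption means, a debug check could compare a
VF's own values against the ones cached from VF 0. The sketch below is
hypothetical and stand-alone (revision_of(), class_of() and vf_ids_match() are
made-up names); the decoding of the PCI_CLASS_REVISION dword follows what
pci_setup_device() does in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Revision ID is the low byte of the PCI_CLASS_REVISION dword. */
uint8_t revision_of(uint32_t class_rev)
{
	return (uint8_t)(class_rev & 0xff);
}

/* Class code is the upper three bytes of the same dword. */
uint32_t class_of(uint32_t class_rev)
{
	return class_rev >> 8;
}

/*
 * Hypothetical check: does this VF report the same class, Revision ID and
 * Subsystem ID as the values cached from VF 0?
 */
bool vf_ids_match(uint32_t cached_class_rev, uint16_t cached_subsys_id,
		  uint32_t vf_class_rev, uint16_t vf_subsys_id)
{
	/* The two decoded parts together cover the whole dword. */
	assert((((uint32_t)class_of(vf_class_rev) << 8) |
		revision_of(vf_class_rev)) == vf_class_rev);

	return class_of(cached_class_rev) == class_of(vf_class_rev) &&
	       revision_of(cached_class_rev) == revision_of(vf_class_rev) &&
	       cached_subsys_id == vf_subsys_id;
}
```

A check like this would itself need config reads, so it would only make sense
as an optional debugging aid, not on the fast path the patch is optimizing.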

> 
> > 
> > +		} else {
> > +			pci_read_config_word(dev, PCI_SUBSYSTEM_VENDOR_ID, &dev->subsystem_vendor);
> > +			pci_read_config_word(dev, PCI_SUBSYSTEM_ID, &dev->subsystem_device);
> > +		}
> >  
> >  		/*
> >  		 * Do the ugly legacy mode stuff here rather than broken chip
> > -- 
> > 2.7.4
> > 
> 
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



