Re: [RFC][PATCH] PCI: Enable Function Level Reset(FLR) for PCI-E

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wednesday 27 August 2008 21:13:48 Matthew Wilcox wrote:
> On Wed, Aug 27, 2008 at 05:32:38PM +0800, Yang, Sheng wrote:

Thanks for the comments!
> >  /**
> > + * pcie_do_flr - execute Function Level Reset on device if
> > capability exist
> > + * @dev: PCI device to execute FLR
> > + *
> > + * Returns 0 on success, or -EOPNOTSUPP if device don't support
> > the feature.
> > + *
> > + * Notice that on rare case, the Transaction Pending bit does
> > not clear before
> > + * FLR, we forced to continue the process, bless everything is
> > going well...
> > + */
>
> The kerneldoc could use some work ...

Yeah...
>
> > +int pcie_do_flr(struct pci_dev *dev)
> > +{
> > +#define FLR_CONF_SPACE_BACKUP_SIZE	    16
> > +	u16 status;
> > +	u32 cap, cs_backup[FLR_CONF_SPACE_BACKUP_SIZE];
> > +	int i;
> > +	int exppos = pci_find_capability(dev, PCI_CAP_ID_EXP);
> > +	unsigned long start;
> > +
> > +	if (exppos) {
> > +		pci_read_config_dword(dev, exppos + PCI_EXP_DEVCAP, &cap);
> > +		if (!(cap & PCI_EXP_DEVCAP_FLR))
> > +			return -EOPNOTSUPP;
> > +	}
>
> Surely you mean:
>
> 	if (!exppos)
> 		return -EOPNOTSUPP;
> 	pci_read_config_dword(dev, exppos + PCI_EXP_DEVCAP, &cap);
> 	if (!(cap & PCI_EXP_DEVCAP_FLR))
> 		return -EOPNOTSUPP;
>
> I'm also not convinced that EOPNOTSUPP is the right error.  ENOTTY
> might be more traditional.

... Neglected that, thanks for point it out...

But I don't understand ENOTTY here...

> > +	/* Backup PCI configuration space */
> > +	for (i = 0; i < FLR_CONF_SPACE_BACKUP_SIZE; i++)
> > +		pci_read_config_dword(dev, i * 0x04, &cs_backup[i]);
>
> I'd use a plain '4' here instead of '0x04'
>
> > +	/* Clear entire Command register */
> > +	pci_write_config_word(dev, PCI_COMMAND, 0);
> > +	pci_write_config_word(dev, exppos + PCI_EXP_DEVCTL, 0);
> > +
> > +	start = jiffies;
> > +	/* Wait 100ms for Transaction Pending bit clean */
>
> Shouldn't we lock out userspace from accessing the device too?
> ie call pci_block_user_cfg_access().  What happens when userspace
> tries to access a PCI device they have mapped?  Is that just a
> "don't do that" scenario?
>

I supposed this can be done in userspace, but using a lock here is 
much better. :)

> > +	while (jiffies - start < HZ / 10) {
> > +		pci_read_config_word(dev, exppos + PCI_EXP_DEVSTA, &status);
> > +		if (!(status & PCI_EXP_DEVSTA_TRPND))
> > +			break;
> > +		cpu_relax();
> > +	}
> > +	if (status & PCI_EXP_DEVSTA_TRPND) {
> > +		printk(KERN_INFO
> > +			"Transacation Pending bit haven't been cleaned. "
>
> typo "Transaction".
>
> > +			"Longer wait time involved, 5 seconds at most.\n");
>
> Actually, since we have a device here, something like:
>
> 		dev_info(&dev->dev, "Still busy after 100ms.  Waiting "
> 				"5 seconds\n");
>
> might be more appropriate.
>
> > +		start = jiffies;
> > +		while (jiffies - start <= HZ * 5) {
> > +			pci_read_config_word(dev,
> > +					     exppos + PCI_EXP_DEVSTA, &status);
> > +			if (!(status & PCI_EXP_DEVSTA_TRPND))
> > +				break;
> > +			cpu_relax();
> > +		}
>
> These busy loops aren't great.  For one thing, we're generating a
> lot of cycles on the PCIe fabric, which could be hindering the
> ability of the device to complete the transaction we're waiting
> for.  Why not just
>
> 		ssleep(5);
> 		pci_read_config_word(dev, exppos + PCI_EXP_DEVSTA, &status);
> 		if (status & PCI_EXP_DEVSTA_TRPND)
> 			dev_info(&dev->dev, "Still busy after 5 seconds,
> 					"resetting anyay\n");
>

Yeah, it's better.

> > +		if (jiffies - start <= HZ * 5)
> > +			printk(KERN_INFO
> > +			"Transacation Pending bit have been cleaned.\n");
> > +		else
> > +			printk(KERN_INFO
> > +			"Fail to wait Transacation Pending bit clean,"
> > +			" force to FLR\n");
> > +	}
> > +	pci_write_config_word(dev, exppos + PCI_EXP_DEVCTL,
> > +				PCI_EXP_DEVCTL_BCR_FLR);
> > +	mdelay(100);
> > +
> > +	/* Restore PCI configuration space */
> > +	for (i = 0; i < FLR_CONF_SPACE_BACKUP_SIZE; i++)
> > +		pci_write_config_dword(dev, i * 0x04, cs_backup[i]);
>
> I'm a little concerned by this blanket save/restore.  For example,
> if you do an FLR, the device is required to clear its MSI state. 
> But the Linux struct pci_dev is still going to have a record of the
> MSI state being set up, and that won't be restored.

I think I can remove this part. The save/restore can be specific on 
each usage model, e.g. for error recover, we may not restore like 
this.

> I think we probably want to treat FLR as a PCI hotplug operation
> (on this one devfn), first a remove and then an insert.  That way
> any driver associated with this card will tear down its assciated
> data structures properly.

Maybe not. Can we hot-plug a function? It's function level reset, so 
if we have a 4 port NIC here, we just want to reset one function of 
it, but PCI hotplug do the reset of the whole device. So if we use 
another port on NIC for keeping connect with the machine, there is a 
problem.

>
> This should also be tied into the error handling framework that
> Linas Vepstas put in.  I don't have a clear idea how -- is this
> supposed to be used as _part_ of error recovery (ie function hung,
> reset it) or should this _trigger_ error recovery?

I think it's more fit for a error recovery? I will check error 
handling framework.

--
regards
Yang, Sheng
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux