Re: [RFC][PATCH] PCI: Enable Function Level Reset(FLR) for PCI-E

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 27, 2008 at 05:32:38PM +0800, Yang, Sheng wrote:
>  /**
> + * pcie_do_flr - execute Function Level Reset on device if capability 
> exist
> + * @dev: PCI device to execute FLR
> + *
> + * Returns 0 on success, or -EOPNOTSUPP if device don't support the 
> feature.
> + *
> + * Notice that on rare case, the Transaction Pending bit does not 
> clear before
> + * FLR, we forced to continue the process, bless everything is going 
> well...
> + */

The kerneldoc could use some work ...

> +int pcie_do_flr(struct pci_dev *dev)
> +{
> +#define FLR_CONF_SPACE_BACKUP_SIZE	    16
> +	u16 status;
> +	u32 cap, cs_backup[FLR_CONF_SPACE_BACKUP_SIZE];
> +	int i;
> +	int exppos = pci_find_capability(dev, PCI_CAP_ID_EXP);
> +	unsigned long start;
> +
> +	if (exppos) {
> +		pci_read_config_dword(dev, exppos + PCI_EXP_DEVCAP, &cap);
> +		if (!(cap & PCI_EXP_DEVCAP_FLR))
> +			return -EOPNOTSUPP;
> +	}

Surely you mean:

	if (!exppos)
		return -EOPNOTSUPP;
	pci_read_config_dword(dev, exppos + PCI_EXP_DEVCAP, &cap);
	if (!(cap & PCI_EXP_DEVCAP_FLR))
		return -EOPNOTSUPP;

I'm also not convinced that EOPNOTSUPP is the right error.  ENOTTY might
be more traditional.

> +	/* Backup PCI configuration space */
> +	for (i = 0; i < FLR_CONF_SPACE_BACKUP_SIZE; i++)
> +		pci_read_config_dword(dev, i * 0x04, &cs_backup[i]);

I'd use a plain '4' here instead of '0x04'

> +	/* Clear entire Command register */
> +	pci_write_config_word(dev, PCI_COMMAND, 0);
> +	pci_write_config_word(dev, exppos + PCI_EXP_DEVCTL, 0);
> +
> +	start = jiffies;
> +	/* Wait 100ms for Transaction Pending bit clean */

Shouldn't we lock out userspace from accessing the device too?
ie call pci_block_user_cfg_access().  What happens when userspace tries
to access a PCI device they have mapped?  Is that just a "don't do that"
scenario?

> +	while (jiffies - start < HZ / 10) {
> +		pci_read_config_word(dev, exppos + PCI_EXP_DEVSTA, &status);
> +		if (!(status & PCI_EXP_DEVSTA_TRPND))
> +			break;
> +		cpu_relax();
> +	}
> +	if (status & PCI_EXP_DEVSTA_TRPND) {
> +		printk(KERN_INFO
> +			"Transacation Pending bit haven't been cleaned. "

typo "Transaction".

> +			"Longer wait time involved, 5 seconds at most.\n");

Actually, since we have a device here, something like:

		dev_info(&dev->dev, "Still busy after 100ms.  Waiting "
				"5 seconds\n");

might be more appropriate.

> +		start = jiffies;
> +		while (jiffies - start <= HZ * 5) {
> +			pci_read_config_word(dev,
> +					     exppos + PCI_EXP_DEVSTA, &status);
> +			if (!(status & PCI_EXP_DEVSTA_TRPND))
> +				break;
> +			cpu_relax();
> +		}

These busy loops aren't great.  For one thing, we're generating a lot of
cycles on the PCIe fabric, which could be hindering the ability of the
device to complete the transaction we're waiting for.  Why not just

		ssleep(5);
		pci_read_config_word(dev, exppos + PCI_EXP_DEVSTA, &status);
		if (status & PCI_EXP_DEVSTA_TRPND)
			dev_info(&dev->dev, "Still busy after 5 seconds,
					"resetting anyay\n");

> +		if (jiffies - start <= HZ * 5)
> +			printk(KERN_INFO
> +			"Transacation Pending bit have been cleaned.\n");
> +		else
> +			printk(KERN_INFO
> +			"Fail to wait Transacation Pending bit clean,"
> +			" force to FLR\n");
> +	}
> +	pci_write_config_word(dev, exppos + PCI_EXP_DEVCTL,
> +				PCI_EXP_DEVCTL_BCR_FLR);
> +	mdelay(100);
> +
> +	/* Restore PCI configuration space */
> +	for (i = 0; i < FLR_CONF_SPACE_BACKUP_SIZE; i++)
> +		pci_write_config_dword(dev, i * 0x04, cs_backup[i]);

I'm a little concerned by this blanket save/restore.  For example, if
you do an FLR, the device is required to clear its MSI state.  But the
Linux struct pci_dev is still going to have a record of the MSI state
being set up, and that won't be restored.

I think we probably want to treat FLR as a PCI hotplug operation (on
this one devfn), first a remove and then an insert.  That way any driver
associated with this card will tear down its assciated data structures
properly.

This should also be tied into the error handling framework that
Linas Vepstas put in.  I don't have a clear idea how -- is this supposed
to be used as _part_ of error recovery (ie function hung, reset it) or
should this _trigger_ error recovery?

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux