On Wed, Aug 27, 2008 at 05:32:38PM +0800, Yang, Sheng wrote: > /** > + * pcie_do_flr - execute Function Level Reset on device if capability > exist > + * @dev: PCI device to execute FLR > + * > + * Returns 0 on success, or -EOPNOTSUPP if device don't support the > feature. > + * > + * Notice that on rare case, the Transaction Pending bit does not > clear before > + * FLR, we forced to continue the process, bless everything is going > well... > + */ The kerneldoc could use some work ... > +int pcie_do_flr(struct pci_dev *dev) > +{ > +#define FLR_CONF_SPACE_BACKUP_SIZE 16 > + u16 status; > + u32 cap, cs_backup[FLR_CONF_SPACE_BACKUP_SIZE]; > + int i; > + int exppos = pci_find_capability(dev, PCI_CAP_ID_EXP); > + unsigned long start; > + > + if (exppos) { > + pci_read_config_dword(dev, exppos + PCI_EXP_DEVCAP, &cap); > + if (!(cap & PCI_EXP_DEVCAP_FLR)) > + return -EOPNOTSUPP; > + } Surely you mean: if (!exppos) return -EOPNOTSUPP; pci_read_config_dword(dev, exppos + PCI_EXP_DEVCAP, &cap); if (!(cap & PCI_EXP_DEVCAP_FLR)) return -EOPNOTSUPP; I'm also not convinced that EOPNOTSUPP is the right error. ENOTTY might be more traditional. > + /* Backup PCI configuration space */ > + for (i = 0; i < FLR_CONF_SPACE_BACKUP_SIZE; i++) > + pci_read_config_dword(dev, i * 0x04, &cs_backup[i]); I'd use a plain '4' here instead of '0x04' > + /* Clear entire Command register */ > + pci_write_config_word(dev, PCI_COMMAND, 0); > + pci_write_config_word(dev, exppos + PCI_EXP_DEVCTL, 0); > + > + start = jiffies; > + /* Wait 100ms for Transaction Pending bit clean */ Shouldn't we lock out userspace from accessing the device too? ie call pci_block_user_cfg_access(). What happens when userspace tries to access a PCI device they have mapped? Is that just a "don't do that" scenario? > + while (jiffies - start < HZ / 10) { > + pci_read_config_word(dev, exppos + PCI_EXP_DEVSTA, &status); > + if (!(status & PCI_EXP_DEVSTA_TRPND)) > + break; > + cpu_relax(); > + } > + if (status & PCI_EXP_DEVSTA_TRPND) { > + printk(KERN_INFO > + "Transacation Pending bit haven't been cleaned. " typo "Transaction". > + "Longer wait time involved, 5 seconds at most.\n"); Actually, since we have a device here, something like: dev_info(&dev->dev, "Still busy after 100ms. Waiting " "5 seconds\n"); might be more appropriate. > + start = jiffies; > + while (jiffies - start <= HZ * 5) { > + pci_read_config_word(dev, > + exppos + PCI_EXP_DEVSTA, &status); > + if (!(status & PCI_EXP_DEVSTA_TRPND)) > + break; > + cpu_relax(); > + } These busy loops aren't great. For one thing, we're generating a lot of cycles on the PCIe fabric, which could be hindering the ability of the device to complete the transaction we're waiting for. Why not just ssleep(5); pci_read_config_word(dev, exppos + PCI_EXP_DEVSTA, &status); if (status & PCI_EXP_DEVSTA_TRPND) dev_info(&dev->dev, "Still busy after 5 seconds, "resetting anyay\n"); > + if (jiffies - start <= HZ * 5) > + printk(KERN_INFO > + "Transacation Pending bit have been cleaned.\n"); > + else > + printk(KERN_INFO > + "Fail to wait Transacation Pending bit clean," > + " force to FLR\n"); > + } > + pci_write_config_word(dev, exppos + PCI_EXP_DEVCTL, > + PCI_EXP_DEVCTL_BCR_FLR); > + mdelay(100); > + > + /* Restore PCI configuration space */ > + for (i = 0; i < FLR_CONF_SPACE_BACKUP_SIZE; i++) > + pci_write_config_dword(dev, i * 0x04, cs_backup[i]); I'm a little concerned by this blanket save/restore. For example, if you do an FLR, the device is required to clear its MSI state. But the Linux struct pci_dev is still going to have a record of the MSI state being set up, and that won't be restored. I think we probably want to treat FLR as a PCI hotplug operation (on this one devfn), first a remove and then an insert. That way any driver associated with this card will tear down its assciated data structures properly. This should also be tied into the error handling framework that Linas Vepstas put in. I don't have a clear idea how -- is this supposed to be used as _part_ of error recovery (ie function hung, reset it) or should this _trigger_ error recovery? -- Matthew Wilcox Intel Open Source Technology Centre "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html