Re: [PATCH 2/2] pci: Add ignore indicator quirk for devices

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 18, 2016 at 02:56:56PM -0500, Bjorn Helgaas wrote:
> On Wed, Aug 17, 2016 at 07:09:51PM -0400, Keith Busch wrote:
> > We've augmented 'ledmon' with libpci and toggles these indicators very
> > similar to how 'setpci' can change PCI config registers. It is done only
> > after the storage device is up and registered, which is well outside
> > the time when pciehp is actively using the slot control.
> 
> I assume this means ledmon writes Slot Control to manage the LEDs.

Correct, the registers that are defined for AttnCtrl and PwrCtrl are
what's being redefined for the new pattern.

We are definitely not dead set on requiring we let ledmon have direct
access to SlotCtl through libpci. That's just the one way we thought of
that didn't require new kernel dependencies.

> That sounds pretty scary to me because it's impossible to enforce the
> assumption that ledmon only uses Slot Control when pciehp isn't using
> it.  Hotplug includes surprise events, and I think pciehp is entitled
> to assume it is the sole owner of Slot Control.

I understand the concern here. You don't want independent programs
vying to mask/unmask bits in SlotCtl and end up with a state neither
intended. The only scenario that I can come up with where that could
happen is if ledmon changes the LED on a slot at the same time the drive
is removed, but there is no MRL or attention buttons on this hardware,
and the rest that we do care about looks totally harmless on these slots.

This condition also isn't really new. While probably not recommended,
I could blink the standard Attn and Pwr LED's like this (user, of course,
assumes all the risks):

  # setpci -s <B:D.f> CAP_EXP+18.w=280:3c0

It's basically the same as what we'd have ledmon do, but with a less
esoteric syntax:

  # ledctl locate=/dev/nvme0n1

[ledmon invokes ledctl from a daemon, but ledctl can run on its own]

> I wonder if there's some way to implement the LED control in pciehp,
> e.g., by enhancing pciehp_set_attention_status().  I assume the
> desired indicator state is coming from some in-kernel storage driver,
> and ledmon learns about that somehow, then fiddles with Slot Control
> behind the kernel's back?  That looping from kernel to user and back
> to kernel sounds a little dodgy and fragile.

That is definitely an interesting possibility if you are open to
this. We'd "just" need the kernel to comprehend these vendor specific
bit patterns for this particular generation of hardware.

If you're okay with that, then I'm more than happy to propose a patch
for consideration. We can then have ledmon subscribe to the sysfs entry
point instead of going through libpci, no problem.
 
> Can you share any details about how you plan to implement this on the
> next generation of devices?  Maybe we can enhance pciehp indicator
> control in a way that supports both Sky Lake and future devices.

One possibility is to define through PCI-SIG
SlotCap2 and SlotCtl2 with this kind of LED control.

Another possibilities I hear about through the NVMe-MI working group
is creating similar type of devices as SES (SCSI Enclosure Services)
for PCIe SSD enclosures.



Finally, at the risk of digressing this thread, but if I may switch
topic back to suppressing the indicator presence fields...

Since these are specific to VMD domains, there are folks at Intel who
want to see the VMD driver mask off these capabilities from the driver.
There is never a case where a slot in this domain should advertise these,
so masking off the bits at the config read level would achieve the desired
result. It'd be isolated to the VMD driver, and the quirk wouldn't need
to be maintained as addtional vendor devices implementing the same quirk
are made.

Below is the diff for *that* proposal, and the developer of this patch
wanted to know if you may consider this over the quirk that was previously
written. This looks odd to me, but I can't argue against that it achieves
what the device maker desires.

Thanks!
Keith

---
diff --git a/arch/x86/pci/vmd.c b/arch/x86/pci/vmd.c
index b73da50..aa2791f 100644
--- a/arch/x86/pci/vmd.c
+++ b/arch/x86/pci/vmd.c
@@ -452,6 +452,7 @@ static int vmd_pci_read(struct pci_bus *bus,
unsigned int devfn, int reg,
 {
 	struct vmd_dev *vmd = vmd_from_bus(bus);
 	char __iomem *addr = vmd_cfg_addr(vmd, bus, devfn, reg, len);
+	struct pci_dev *dev;
 	unsigned long flags;
 	int ret = 0;
 
@@ -474,6 +475,17 @@ static int vmd_pci_read(struct pci_bus *bus,
unsigned int devfn, int reg,
 		break;
 	}
 	spin_unlock_irqrestore(&vmd->cfg_lock, flags);
+
+	list_for_each_entry(dev, &bus->devices, bus_list) {
+		if (dev->devfn == devfn) {
+			if (dev->pcie_cap &&
+					reg == dev->pcie_cap + PCI_EXP_SLTCAP)
+				*value &= ~(PCI_EXP_SLTCAP_AIP |
+					    PCI_EXP_SLTCAP_PIP);
+			break;
+		}
+	}
+
 	return ret;
 }
 
--
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux