Re: [PATCH] PCI: pciehp: Ignore spurious link inactive change when off

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 24, 2021 at 05:42:18PM -0500, Bjorn Helgaas wrote:
> On Fri, Apr 09, 2021 at 02:59:35PM -0600, Jon Derrick wrote:
> > When a specific x8 CEM card is bifurcated into x4x4 mode, and the
> > upstream ports both support hotplugging on each respective x4 device, a
> > slot management system for the CEM card requires both x4 devices to be
> > sysfs removed from the OS before it can safely turn-off physical power.
> > The implications are that Slot Control will display Powered Off status
> > for the device where the device is actually powered until both ports
> > have powered off.
> > 
> > When power is removed from the first half, the link remains active to
> > provide clocking while waiting for the second half to have power
> > removed. When power is then removed from the second half, the first half
> > starts shutdown sequence and will trigger a link status change event.
> > This is misinterpreted as an enabling event due to positive presence
> > detect and causes the first half to be re-enabled.
> > 
> > The spurious enable can be resolved by ignoring link status change
> > events when no link is active when in the off state.

Sorry for not responding earlier, I missed this patch.


> > --- a/drivers/pci/hotplug/pciehp_ctrl.c
> > +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> > @@ -265,6 +265,11 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
> >  		cancel_delayed_work(&ctrl->button_work);
> >  		fallthrough;
> >  	case OFF_STATE:
> > +		if ((events & PCI_EXP_SLTSTA_DLLSC) && !link_active) {
> > +			mutex_unlock(&ctrl->state_lock);
> > +			break;
> > +		}
> > +

I think this change will inadvertently ignore events that shouldn't be
ignored:  E.g., a DLLSC event may have been triggered by replacement of
the card in the slot and while Presence Detect State is 1, the link may
not yet be active.  The change above will cause not only the DLLSC but
also the PDC event to be ignored.

There are also reports of link flaps on card insertion and the above
change may result in the slot not being brought up even though it should.

The commit message sounds like powering down the CEM card takes longer
than expected.  We wait 1 second in set_slot_off() after disabling
slot power and that's apparently not sufficient.  The 1 second delay
is mandated by section 6.7.1.8 of the PCIe Base Spec.  If this card
needs a longer delay, a quirk should be added rather than changing
the algorithm for everyone.

Thanks,

Lukas



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux