Re: PCI: hotplug: Erroneous removal of hotplug PCI devices

On 1/26/2019 6:12 AM, Lukas Wunner wrote:
> 
<snip>

> 
> Yes, that's why pci_bus_check_dev() polls the vendor ID register for
> up to 1 sec before it declares failure.  If that approach is not correct
> or if we deviate from the spec somewhere else then that should of course
> be fixed, so please let me know if so.
> 
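
As an aside for anyone reading along: the poll you describe amounts to
re-reading the Vendor ID until it stops returning the "nobody home" or
CRS patterns.  Here is a stripped-down sketch, not the actual
pci_bus_check_dev() code; wait_for_valid_vendor_id() and the 20 ms step
are made up for illustration:

#include <linux/pci.h>
#include <linux/delay.h>

static bool wait_for_valid_vendor_id(struct pci_bus *bus, int devfn,
				     int timeout_ms)
{
	int waited = 0;
	u32 id;

	for (;;) {
		if (pci_bus_read_config_dword(bus, devfn, PCI_VENDOR_ID, &id))
			return false;		/* config access itself failed */

		/*
		 * All-ones means nobody answered; with CRS Software
		 * Visibility enabled, a device that is still coming up
		 * reads back as Vendor ID 0x0001, so keep polling in
		 * both cases.
		 */
		if (id != 0xffffffff && (id & 0xffff) != 0x0001)
			return true;

		if (waited >= timeout_ms)
			return false;		/* e.g. give up after 1000 ms */

		msleep(20);
		waited += 20;
	}
}
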
>> And Section 6.6.1 on Conventional Reset says "Software should use 100 ms
>> wait periods only if software enables CRS Software Visibility.
>> Otherwise, Completion timeouts, platform timeouts, or lengthy processor
>> instruction stalls may result."
>>
>> The "Completion timeouts" reference here is the key. Since CRS Software
>> Visibility is not used, the device still has 900 ms before it has to
>> return a Successful Completion if you issue a config read 100 ms after
>> DLLSC.
> 
> I'm truly sorry, I double-checked the code and it turns out Linux does
> enable CRS Software Visibility unconditionally for all bridges that
> support it:
> 
> pci_scan_bridge_extend()
>    pci_enable_crs()
> 
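
For anyone else following the thread: pci_enable_crs() comes down to one
bit.  If the Root Capabilities register advertises CRS Software
Visibility, it sets CRS Software Visibility Enable in the Root Control
register.  A simplified paraphrase rather than the exact code, with
enable_crs_sv() as an invented name:

#include <linux/pci.h>

static void enable_crs_sv(struct pci_dev *root_port)
{
	u16 root_cap = 0;

	/* Root Capabilities advertises CRS Software Visibility support */
	pcie_capability_read_word(root_port, PCI_EXP_RTCAP, &root_cap);

	/* If supported, set CRS Software Visibility Enable in Root Control */
	if (root_cap & PCI_EXP_RTCAP_CRSVIS)
		pcie_capability_set_word(root_port, PCI_EXP_RTCTL,
					 PCI_EXP_RTCTL_CRSSVE);
}
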
> My apologies for the misinformation in my previous e-mail.  Since your
> explanation hinges on CRS not being used, I'm not sure to what extent it
> applies to the problem at hand.  (I do thank you though for the
> comprehensive explanation, I have a much better understanding now
> of the issue you're facing.)
> 
> 

Yes, great: with the CRS Software Visibility mechanism in place this
sounds good.  Thanks for double-checking!  There is still a weird corner
case: as you insert a device, the Data Link Layer can become active while
the gold fingers on the connector just barely make contact, and as you
continue to seat the device the link goes back down and then comes back
up once it's fully inserted (we've captured this and other similar
oddities in the lab).  The link down/up resets the device, so in this
case it looks like the logic could issue a config read before the device
is ready.  On the platform side we usually wait 1 second around hot-plug
events for everything to reach steady state, since there is otherwise no
way to know when the mechanical action of a hot-insert/hot-remove has
ended.
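
To make "steady state" concrete, the waiting is conceptually something
like the kernel-style sketch below.  This is illustrative only:
wait_link_steady() is an invented name, and the 10 ms poll interval and
the timeout are arbitrary.  The idea is simply to trust Data Link Layer
Link Active only once it has stayed set for the whole debounce interval:

#include <linux/pci.h>
#include <linux/delay.h>

static bool wait_link_steady(struct pci_dev *port, int debounce_ms,
			     int timeout_ms)
{
	int stable = 0, waited = 0;
	u16 lnksta;

	while (waited < timeout_ms) {
		pcie_capability_read_word(port, PCI_EXP_LNKSTA, &lnksta);

		if (lnksta & PCI_EXP_LNKSTA_DLLLA)
			stable += 10;	/* link still up, keep counting */
		else
			stable = 0;	/* link flapped, restart the count */

		if (stable >= debounce_ms)
			return true;	/* e.g. 1000 ms of continuous DLLLA */

		msleep(10);
		waited += 10;
	}

	return false;			/* never settled within timeout_ms */
}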

The other way to handle this is via DPC, since it keeps the link from
coming back up on its own, but more work is needed to support DPC on
some systems.  The coming ACPI additions should hopefully allow DPC to
be enabled on those systems.
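
For completeness, arming DPC itself is only a couple of register writes
once the port supports it and firmware grants the OS control.  Something
along these lines, simplified and with interrupt setup and error
handling omitted (arm_dpc() is an invented name):

#include <linux/errno.h>
#include <linux/pci.h>

static int arm_dpc(struct pci_dev *port)
{
	int pos = pci_find_ext_capability(port, PCI_EXT_CAP_ID_DPC);
	u16 ctl;

	if (!pos)
		return -ENODEV;		/* port has no DPC capability */

	pci_read_config_word(port, pos + PCI_EXP_DPC_CTL, &ctl);

	/* Trigger containment on fatal errors and enable the DPC interrupt */
	ctl |= PCI_EXP_DPC_CTL_EN_FATAL | PCI_EXP_DPC_CTL_INT_EN;

	return pci_write_config_word(port, pos + PCI_EXP_DPC_CTL, ctl);
}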

>>> Is this product being shipped or can it still be changed?
>>
>> It's shipping.  It could be changed with a firmware change to re-enable
>> in-band presence but then we'd have the issue where hot-plug breaks when
>> removing devices when link is in certain states like L1 so that is not a
>> viable solution.
> 
> An alternative idea would be to change firmware to always drive PCIe PRSNT#
> high as soon as U.2 PRSNT#/IfDet# is sensed, regardless of whether it's NVMe
> or not.  Linux will try to bring up the slot and if it can't be brought up
> because it's a SAS drive, all you'll get is an error message in dmesg,
> but you shouldn't see other negative effects.

We used to do this, but that error (and similar errors in other OSes) is
what prompted the 500 ms delay circuitry.

> 
> I think the proper solution would be to add a microcontroller to each
> U.2 slot on your motherboard which snoops on the PCIe pins to determine
> whether communication is taking place and sample the PRSNT#/IfDet# pins
> if so.  This leads to higher BOM cost and development cost of course.
> 
> Arguably your platform firmware-based solution tries to avoid that cost
> by simulating what a microcontroller would do but can't really achieve
> the same result because it lacks access to the PCIe pins.  Vendors such
> as Apple do go to the lengths of adding microcontrollers in similar cases,
> e.g. Apple used one on their motherboards with first generation Thunderbolt
> controllers:
> 
> Because the physical port can be used either to attach a Thunderbolt
> or a DisplayPort device, they added an NXP LP1112A ARM controller
> which snoops on the pins and drives a multiplexer to route signals
> either to the Thunderbolt controller or the GPU.  See page 75 of
> these schematics ("DisplayPort/T29 A MUXing"):
> 
> https://forums.macrumors.com/attachments/a1278-j30-820-3115-b-051-9058-pdf.451409/
> 
> Newer Thunderbolt controllers integrate that functionality so the
> BOM cost and development cost are no longer necessary today.
> 
> 

Thanks for the links.  And yes, we looked at approaches like this too,
but they're prohibitively expensive.  It's one thing if you have a
single Thunderbolt connector, but we have 24 or more drives.

We do have other, less expensive ways to detect when a link has trained,
but those mechanisms, like the microcontroller idea above, all require
hardware changes, so they won't be possible on the systems already in
the field.  Future platforms + DPC should handle this better.

>>> Unfortunately I don't have access to those ECNs or the non-public
>>> git repo.
>>
>> The PCIe Async Hot-Plug ECN has most of the new stuff. It has been
>> released and has the same level of access as the PCIe Base Spec so if
>> you have access to the PCIe base spec then you should have access to
>> this ECN.
> 
> Unfortunately I'm not associated with a member of the PCISIG,
> so my level of access to the PCIe Base Spec is constrained
> to leaked documents found on Ukrainian websites. ;-)
>

Maybe you'll get lucky and they'll leak the ECN too :-)

> Thanks,
> 
> Lukas
> 




