Re: xhci_hcd 0000:11:00.0: HW died, polling stopped.

On Wed, 1 May 2013, Martin Mokrejs wrote:

> > That's not how drivers work in Linux.  They don't unbind all by 
> > themselves; they wait until the bus-level code tells them to unbind.
> > xhci-hcd is not alone in this respect; all the drivers behave this way.
> 
> I don't believe that. In my tests only the USB3 express card suffered from
> "the problem", unlike the firewire_ohci and sata_sil24-based cards.

That's not necessarily the same thing.

> Do you remember the thread https://lkml.org/lkml/2012/4/16/566
> ... where a timeout of about 60 sec was needed to get USB working again?

No.

> I think I have meanwhile seen others talking about a 30 sec delay, but I believe
> this would all be easier if xhci_hcd unbound itself from a dead device.
> 
> I am naively thinking that PCI has no way to detect that a card was hot-unplugged
> if e.g. hotplug was completely left out of the kernel .config or when acpiphp/pciehp
> don't work, for whatever reason. But xhci_hcd has the unique advantage that it
> does polling and it knows the device is dead. Probably the same applies to uhci/ehci.
> I just don't see why an upper level that realizes there is a problem could not
> take action.

These drivers _do_ take action when they realize their controller is
dead.  But it's not their job to recognize when the controller has been 
removed or when they should unbind, and they don't try to do it.

> Other drivers probably don't do polling, by design, so they are in another
> situation.
> 
> > 
> >> So what can be done so that the user does not have to run 
> >>
> >> echo 1 > /sys/bus/pci/devices/0000:11:00.0/remove
> >>
> >> manually? Couldn't xhci_hcd detect somehow that the device is dead or ejected?
> > 
> > It could detect that the device is dead.  In fact, it probably detects 
> > that now.  But even if it could tell that the device had been ejected, 
> > it would not unbind itself.
> > 
> > What can be done is to fix the PCIe core code so that it correctly
> > realizes when an eject takes place.
> 
> I believe that will be fixed one day, as I found that pciehp is broken
> in its action by pcie_aspm=off whereas it works when pcie_aspm=native.
> That in turn points to bad ASPM L0/L1 handling and seems similar to issues
> others had with PCIe LnkCtl on iwlwifi. That is somehow related to those
> OSC_ trickeries in ACPI. Finally, it seems others hit ASPM issues with Dell
> Vostro laptops. :( This will all hopefully get fixed. But I want a USB
> fix as well. ;-)

When the PCI and ACPI layers are fixed, USB will automatically work
correctly too.

We don't work around problems in one driver by papering over them in 
another driver.  Instead, we fix the original driver.
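
Until the hotplug code is fixed, the manual removal quoted earlier in
this thread can at least be wrapped in a small script.  What follows is
only a sketch of that workaround, not a fix.  The function name
pci_remove and the SYSFS_ROOT override are made up for illustration
(the override exists only so the function can be tried against a
scratch directory), and the device address is the one from this report.

```shell
#!/bin/sh
# Hypothetical helper: ask the PCI core to remove a device through its
# sysfs "remove" attribute, i.e. the same thing as
#   echo 1 > /sys/bus/pci/devices/<address>/remove
# SYSFS_ROOT is an illustrative override; it defaults to the real /sys.
pci_remove() {
    dev="$1"                                  # e.g. 0000:11:00.0
    node="${SYSFS_ROOT:-/sys}/bus/pci/devices/$dev"

    if [ ! -e "$node/remove" ]; then
        echo "pci_remove: no such PCI device: $dev" >&2
        return 1
    fi

    # Writing 1 makes the PCI core unbind the driver and drop the device.
    echo 1 > "$node/remove"
}
```

As root, this would be called as "pci_remove 0000:11:00.0" after the
card has been pulled.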

Alan Stern




