On 2020/11/19 14:09, Greg KH Wrote: > On Thu, Nov 19, 2020 at 09:50:20AM +0800, liulongfang wrote: >> On 2020/11/19 0:21, Mathias Nyman Wrote: >>> On 18.11.2020 11.14, Greg KH wrote: >>>> On Wed, Nov 18, 2020 at 05:04:36PM +0800, liulongfang wrote: >>>>> On 2020/11/18 15:02, Greg KH Wrote: >>>>>> On Wed, Nov 18, 2020 at 11:36:13AM +0800, Longfang Liu wrote: >>>>>>> The HCE(Host Controller Error) event has been defined in >>>>>>> the XHCI driver but has not been used. If we want to use >>>>>>> the HCE event to reset the controller, can we implement >>>>>>> it in the interrupt function as follows: >>>>>>> >>>>>>> xhci_irq() >>>>>>> |----xhci_halt() >>>>>>> |----xhci_shutdown() >>>>>>> |----xhci_start() >>>>>>> |----xhci_run() >>>>>>> >>>>>>> Signed-off-by: Longfang Liu <liulongfang@xxxxxxxxxx> >>>>>>> --- >>>>>>> drivers/usb/host/xhci-ring.c | 10 ++++++++++ >>>>>>> 1 file changed, 10 insertions(+) >>>>>> >>>>>> $ ./scripts/get_maintainer.pl --file drivers/usb/host/xhci-ring.c >>>>>> Mathias Nyman <mathias.nyman@xxxxxxxxx> (supporter:USB XHCI DRIVER) >>>>>> Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> (supporter:USB SUBSYSTEM) >>>>>> linux-usb@xxxxxxxxxxxxxxx (open list:USB XHCI DRIVER) >>>>>> linux-kernel@xxxxxxxxxxxxxxx (open list) >>>>>> >>>>>> Any reason to not include the maintainer of the xhci driver here? >>>>> OK, I will include the maintainer in the next patch. >>>>>> >>>>>>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c >>>>>>> index 2c255d0..87b3a40 100644 >>>>>>> --- a/drivers/usb/host/xhci-ring.c >>>>>>> +++ b/drivers/usb/host/xhci-ring.c >>>>>>> @@ -2857,6 +2857,16 @@ irqreturn_t xhci_irq(struct usb_hcd *hcd) >>>>>>> goto out; >>>>>>> } >>>>>>> >>>>>>> + if (status & STS_HCE) { >>>>>>> + xhci_warn(xhci, "WARNING: Host Controller Error\n"); >>>>>>> + xhci_halt(xhci); >>>>>>> + xhci_shutdown(hcd); >>>>>>> + xhci_start(xhci); >>>>>>> + xhci_run(hcd); >>>>>>> + ret = IRQ_HANDLED; >>>>>>> + goto out; >>>>>>> + } >>> This won't work at all. It doesn't reset the xHC which is the one thing needed to >>> recover from a HCE. This would deadlock immediately. >>> Many of the above functions shouldn't be called from interrupt context, and if >>> HCE is set we probably even won't get any interrupts. >>> >>> A reset will set all registers to their initial values, and need to be >>> re-initialized. Nothing is freed or re-allocated, registrs are not re-initialized >>> here. >>> >>> Please revisit the xhci spec about HCE, and HCRST, and a more detail look >>> at the xhci driver, and test the code before submitting. >>> >>>>>>> + >>>>>> >>>>>> Does this fix a real problem for you? Are you sure we will not suddenly >>>>>> start resetting devices that were working properly and sending this >>>>>> error incorrectly? How did you test this? >>>>>> >>>>>> thanks, >>>>>> >>>>>> greg k-h >>>>>> . >>>>>> >>>>> Yes, we want to add a RAS feature to the USB, >>>> >>>> What is "RAS"? >>>> >>>>> Use the HCE event to trigger the reset operation of the USB controller. >>>> >>>> Is that allowed by the XHCI specification? >>>> >>>>> By searching for the current xhci driver, the driver did not handle HCE event. >>>>> In fact, I am not sure if other operations will cause HCE, The HCE event is >>>>> used to reset the USB controller according to the definition of the event. >>>> >>>> What generates that event? Do existing controllers do that today? What >>>> causes it? >>>> >>> >>> It's not an event, it's a controller internal error state. >>> It's probably not very useful to check for it in the interrupt handler as >>> the xHC hw ceases all activity when it sets the HCE bit, including sending >>> interrupts. >>> >>>> >From the spec: >>> >>> 4.24.1 Internal Errors >>> The Host Controller Error (HCE) flag is asserted when an internal xHC error is >>> detected that exclusively affects the xHC. When the HCE flag is set to ‘1’ the xHC >>> shall cease all activity. Software response to the assertion of HCE is to reset the >>> xHC (HCRST = ‘1’) and reinitialize it >>> >>>>> I test this reset operation through the Sysfs file, but have not tested all usage scenarios. >>>> >>>> What sysfs file? >>> >>> I'm also interested in this, and to know more about RAS? >>> >>> Thanks >>> -Mathias >>> . >>> >> RAS: Reliablity, Availability and Serviceability. > > That's a marketing phrase, what specifically are you talking about when > it comes to the xhci driver? > It means that when the hardware detects HCE, the USB controller could recover. >> sysfs: a virtual file system based on memory, it's similar as proc. > > I am very aware of what sysfs is, look at the MAINTAINERS entry for it :) > >> Use DEVICE_ATTR or DRIVER_ATTR to create files to the /sys directory > > Yes, but what file did you use for this? I don't see one in the kernel > that can cause this, do you? > > thanks, > > greg k-h > . > I found that there is debugfs in the USB xhci driver, and the reset test can be realized by adding the debugfs file thanks, Longfang.