On 6/7/2017 4:45 PM, Mathias Nyman wrote: > Hi > > The internal variable is just what xhci spec recommends as it says the output context is not immediately > updated for example at endpoint doorbell ring. It's to make sure we don't read stale values from the output context. > > The race Alan refers to is a different case. > The endpoint might be halted just before the stop endpoint command is handled by hardware, there is no way > of tracking this from output contexts or local variables. > > Working controllers will just give a context state error if we try to stop a halted endpoint. Agreed. In case of the linux xhci driver, two times stop EP commands are queued up. one within the loop: xhci_queue_stop_endpoint(xhci, command, slot_id, i, suspend); and another out of the loop: xhci_queue_stop_endpoint(xhci, cmd, slot_id, 0, suspend); I could not comprehend what is the actual need to have two stop endpoint command queued up. In our case, when we send the first stop endpoint command it is returning with Context State Error and for the next stop EP HW additionally marks EP to Flow control state in HW which is RTL bug. I did some experiments to see the impact of not queuing the first stop EP command i.e. xhci_queue_stop_endpoint(xhci, command, slot_id, i, suspend); the one which is inside the loop Looks like not queuing this command does not have any impact on the functionality on the connected hub. So, based on this; I have pushed the v2 patch which guards the first stop EP command. Also, for the first stop EP command, while allocation we do not set the 'allocate_completion' flag to true. command = xhci_alloc_command(xhci, false, false, GFP_NOWAIT); So, ideally we cannot check its completion state with command->status. Atleast for this SNPS IP issue, we need to prevent this first stop EP command, either by checking the ep_state or by adding a quirk specific to this revision of IP from SNPS so that it does not hit the internal RTL bug. Kindly let me know your thoughts. > > I tried to ask about this in the first patch revision: > > "I'm talking about the in xhci spec 4.6.9:" > > " A Busy endpoint may asynchronously transition from the Running to the Halted or Error state due > to error conditions detected while processing TRBs. A possible race condition may occur if > software, thinking an endpoint is in the Running state, issues a Stop Endpoint Command however > at the same time the xHC asynchronously transitions the endpoint to the Halted or Error state. In > this case, a Context State Error may be generated for the command completion. Software may > verify that this case occurred by inspecting the EP State for Halted or Error when a Stop Endpoint > Command results in a Context State Error." > I think this scenario is true even with and without this patch. Correct ? > In addition to this patch you probably need to work around that issues as well. > > -Mathias > > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html