On 06.06.2017 13:22, Shyam Sundar S K wrote:
On 6/2/2017 8:21 PM, Alan Stern wrote:
On Fri, 2 Jun 2017, Shyam Sundar S K wrote:
On AMD platforms with the SNPS 3.1 USB controller, the controller stops
responding if a Stop Endpoint command is issued while the EP is not in
the Running state. HW completes the command execution and reports a
"Context State Error" completion code. This is as per the spec. However,
on receiving the second command HW additionally marks the EP as being in
the Flow Control state, which is an RTL bug. Because of this bug HW does
not respond to any further doorbells rung by the driver. The EP is then
no longer functional, which causes gross functional failures.
As a workaround, to avoid hitting this problem, it's better to check the
EP state and issue the Stop EP command only when the EP is in the Running
state.
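The check described above could be sketched roughly as follows. This is plain C for illustration, not the actual patch or xhci driver code: the EP State encoding (bits 2:0 of the first endpoint context dword, with 1 meaning Running) follows the xHCI spec, while `struct ep_ctx`, `ep_ctx_state()` and `should_queue_stop_ep()` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* EP State values from the xHCI Endpoint Context (bits 2:0 of dword 0). */
enum ep_state {
    EP_DISABLED = 0,
    EP_RUNNING  = 1,
    EP_HALTED   = 2,
    EP_STOPPED  = 3,
    EP_ERROR    = 4,
};

#define EP_STATE_MASK 0x7u

/* Hypothetical stand-in for the first dword of an output endpoint context. */
struct ep_ctx {
    uint32_t ep_info;
};

/* Extract the EP State field, ignoring the other bits of the dword. */
static enum ep_state ep_ctx_state(const struct ep_ctx *ctx)
{
    return (enum ep_state)(ctx->ep_info & EP_STATE_MASK);
}

/* Returns true only if the context reports the endpoint as Running. */
static bool should_queue_stop_ep(const struct ep_ctx *ctx)
{
    return ep_ctx_state(ctx) == EP_RUNNING;
}
```

A caller would skip queuing the Stop Endpoint command whenever `should_queue_stop_ep()` returns false.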
Isn't there an unavoidable race? Suppose you check the EP state and
the controller says the endpoint is running. But then a STALL packet
is received and the controller stops the endpoint before you can issue
the Stop-EP command. How would you handle that?
Hi Alan,
Thank you for reviewing the patch.
I think that, to avoid this kind of race condition, it's better to have a variable to keep track of internal
state changes, as rightly pointed out by Mathias (as per the xhci spec, section 4.8.3).
But Mathias mentioned that those changes might not be required for this workaround and can be taken up
at a later point in time. So I resubmitted the patch based on his latest suggestions.
Hi
The internal variable is just what the xhci spec recommends, as it says the output context is not immediately
updated, for example at endpoint doorbell ring. It's to make sure we don't read stale values from the output context.
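As a rough illustration of such an internal variable, the sketch below keeps a software-side copy of the endpoint state that is updated from driver actions and events instead of being read back from the output context. All names here are hypothetical, and it is deliberately not a complete model of the xHCI endpoint state machine.

```c
#include <stdbool.h>

/* Driver's own view of the endpoint, not read from hardware. */
enum sw_ep_state { SW_EP_STOPPED, SW_EP_RUNNING, SW_EP_HALTED };

struct sw_ep {
    enum sw_ep_state state;
};

/* Driver rang the doorbell: a stopped endpoint starts running. */
static void sw_ep_doorbell_rung(struct sw_ep *ep)
{
    if (ep->state == SW_EP_STOPPED)
        ep->state = SW_EP_RUNNING;
}

/* A Stop Endpoint command completed successfully. */
static void sw_ep_stop_completed(struct sw_ep *ep)
{
    ep->state = SW_EP_STOPPED;
}

/* A transfer event reported a halt (for example after a STALL). */
static void sw_ep_halted(struct sw_ep *ep)
{
    ep->state = SW_EP_HALTED;
}

/* A Reset Endpoint command completed: Halted goes back to Stopped. */
static void sw_ep_reset_completed(struct sw_ep *ep)
{
    if (ep->state == SW_EP_HALTED)
        ep->state = SW_EP_STOPPED;
}

static bool sw_ep_is_running(const struct sw_ep *ep)
{
    return ep->state == SW_EP_RUNNING;
}
```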
The race Alan refers to is a different case.
The endpoint might be halted just before the stop endpoint command is handled by hardware; there is no way
of tracking this from output contexts or local variables.
Working controllers will just give a context state error if we try to stop a halted endpoint.
I tried to ask about this in the first patch revision:
"I'm talking about the case in xhci spec 4.6.9:"
" A Busy endpoint may asynchronously transition from the Running to the Halted or Error state due
to error conditions detected while processing TRBs. A possible race condition may occur if
software, thinking an endpoint is in the Running state, issues a Stop Endpoint Command however
at the same time the xHC asynchronously transitions the endpoint to the Halted or Error state. In
this case, a Context State Error may be generated for the command completion. Software may
verify that this case occurred by inspecting the EP State for Halted or Error when a Stop Endpoint
Command results in a Context State Error."
In addition to this patch you probably need to work around that issue as well.
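The verification that section 4.6.9 describes could look roughly like the sketch below: when a Stop Endpoint command completes with Context State Error, check whether the EP State is Halted or Error. The completion codes (1 = Success, 19 = Context State Error) and EP State values are taken from the xHCI spec; the result enum and `classify_stop_ep()` are hypothetical names, and this is not actual driver code.

```c
#include <stdint.h>

/* Completion codes and EP State values as defined by the xHCI spec. */
#define COMP_SUCCESS            1u
#define COMP_CONTEXT_STATE_ERR 19u

#define EP_STATE_MASK   0x7u
#define EP_STATE_HALTED 2u
#define EP_STATE_ERROR  4u

/* Hypothetical result type, for illustration only. */
enum stop_ep_result {
    STOP_EP_DONE,   /* command completed successfully */
    STOP_EP_RACED,  /* endpoint went Halted/Error before the command ran */
    STOP_EP_FAILED, /* any other outcome; needs separate handling */
};

/*
 * comp_code: completion code from the command completion event.
 * ep_info:   first dword of the output endpoint context, read after
 *            the command completion event was received.
 */
static enum stop_ep_result
classify_stop_ep(uint32_t comp_code, uint32_t ep_info)
{
    uint32_t state = ep_info & EP_STATE_MASK;

    if (comp_code == COMP_SUCCESS)
        return STOP_EP_DONE;

    if (comp_code == COMP_CONTEXT_STATE_ERR &&
        (state == EP_STATE_HALTED || state == EP_STATE_ERROR))
        return STOP_EP_RACED;

    return STOP_EP_FAILED;
}
```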
-Mathias