Re: [PATCH 4/5] xhci: Fix command ring pointer corruption while aborting a command

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> This patch passes my test. Can you please clarify the below question that is
> bothering me.

Sure, currently this works as we only have one command ring segment.
But lets look at a theoretical case with two segments far apart where
the last command on a segment is aborted

> 
> Here crcr points to the DMA address of the command which is getting
> executed (soon will be aborted) by the xHC. After the ring is stopped,
> we want xHC to execute the *next* command.
> 
> Is it guaranteed that the upper 32 bits of previous and next commands will
> be same? Because when the issue happens, xHC takes the 32 bits of crcr
> variable and update it's internal pointer from which it will fetch the commands
> next time the ring is started. I think it is guaranteed because the upper 32
> bits only may change when we cross the segments but in which case there will
> be a link TRB in the middle. Since xHC command ring is stopped, it won't be
> fetching the link TRB until door bell is rung again. Is my understanding correct?

Good point. 

This could be an issue as xHC will move the internal pointer to the next
command TRB (past link) right after generating a "command aborted" event,
before generating the "command ring stopped" event. The command ring stopped
event should thus point to the next command TRB to handle.

I'll assume that the "command aborted" and "command ring stopped" events are generated
before xHC updates the internal pointer based on CRCR write.

The CRCR write may thus move the internal dequeue pointer back to previous segment.

This may mess things up, even if the interrupt handler turns the aborted command TRB
to a no-op, it's possible the partial CRCR update could cause internal pointer to 
point to first command of old segment. 

> 
> Also, what if there is a race with xHC just finishing this command (which we
> are currently aborting) and next link TRB is traversed and processing the next
> command in a different segment. For some reason, we could not update our
> deq pointer and in middle of aborting the command (which is already completed)
> and updating the higher 32 bit with the previous deq segment. This is a
> hypothetical question as we are only using 1 segment for the command ring.

In this case we won't see the "command aborted" event. Only a "command ring
stopped" event. and yes, the CRCR write could cause the internal dequeue pointer
to be moved back to the old segment, and we have the same issues as in the 
previous case.

These two cases could probably solved by writing the next command pointer to CRCR
instead.

Only problem I see with this is if xHC hasn't even fetched the command TRB we are
aborting. Then ring will stop at this last command TRB on segment.
Writing the next command TRB pointer to CRCR forces xHC to move to the next
segment without xHC processing the Link TRB, and cycle bit might get out of sync.

Hard to imagine how we end up in this situation.  

Then there is the third case where xHC handled multiple command TRBs, but interrupts
are blocked so driver is unaware of their completion.
Here its possible we write the wrong segment to CRCR even if we use the next
command TRB pointer. xHC could be way past what driver thinks is the next command.

For this case we proably need to check for pending interrupts and unhandled command
completions on the event ring before aborting the command.
But that is probably another patch.

But for now the best solution seems to be just to write next command TRB
pointer to CRCR during abort.

Thanks
-Mathias



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux