Re: [PATCH v2] xhci: Fix Link TRB DMA in command ring stopped completion event

Hi Mathias,

On 10/21/2024 9:09 PM, Mathias Nyman wrote:
> On 21.10.2024 16.19, Faisal Hassan wrote:
>> During the aborting of a command, the software receives a command
>> completion event for the command ring stopped, with the TRB pointing
>> to the next TRB after the aborted command.
>>
>> If the command we abort is located just before the Link TRB in the
>> command ring, then during the 'command ring stopped' completion event,
>> the xHC gives the Link TRB in the event's cmd DMA, which causes a
>> mismatch in handling command completion event.
>>
>> To handle this situation, an additional check has been added to ignore
>> the mismatch error and continue the operation.
>>
>> Fixes: 7f84eef0dafb ("USB: xhci: No-op command queueing and irq handler.")
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Faisal Hassan <quic_faisalh@xxxxxxxxxxx>
>> ---
>> Changes in v2:
>> - Removed traversing of TRBs with in_range() API.
>> - Simplified the if condition check.
>>
>> v1 link:
>> https://lore.kernel.org/all/20241018195953.12315-1-quic_faisalh@xxxxxxxxxxx
>>
>>   drivers/usb/host/xhci-ring.c | 43 +++++++++++++++++++++++++++++++-----
>>   1 file changed, 38 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
>> index b2950c35c740..de375c9f08ca 100644
>> --- a/drivers/usb/host/xhci-ring.c
>> +++ b/drivers/usb/host/xhci-ring.c
>> @@ -126,6 +126,29 @@ static void inc_td_cnt(struct urb *urb)
>>       urb_priv->num_tds_done++;
>>   }
>>
>> +/*
>> + * Return true if the DMA is pointing to a Link TRB in the ring;
>> + * otherwise, return false.
>> + */
>> +static bool is_dma_link_trb(struct xhci_ring *ring, dma_addr_t dma)
>> +{
>> +    struct xhci_segment *seg;
>> +    union xhci_trb *trb;
>> +
>> +    seg = ring->first_seg;
>> +    do {
>> +        if (in_range(dma, seg->dma, TRB_SEGMENT_SIZE)) {
>> +            /* found the TRB, check if it's link */
>> +            trb = &seg->trbs[(dma - seg->dma) / sizeof(*trb)];
>> +            return trb_is_link(trb);
>> +        }
>> +
>> +        seg = seg->next;
>> +    } while (seg != ring->first_seg);
>> +
>> +    return false;
>> +}
>> +
>>   static void trb_to_noop(union xhci_trb *trb, u32 noop_type)
>>   {
>>       if (trb_is_link(trb)) {
>> @@ -1718,6 +1741,7 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
>>
>>       trace_xhci_handle_command(xhci->cmd_ring, &cmd_trb->generic);
>>
>> +    cmd_comp_code = GET_COMP_CODE(le32_to_cpu(event->status));
>>       cmd_dequeue_dma = xhci_trb_virt_to_dma(xhci->cmd_ring->deq_seg,
>>               cmd_trb);
>>       /*
>> @@ -1725,17 +1749,26 @@ static void handle_cmd_completion(struct xhci_hcd *xhci,
>>        * command.
>>        */
>>       if (!cmd_dequeue_dma || cmd_dma != (u64)cmd_dequeue_dma) {
>> -        xhci_warn(xhci,
>> -              "ERROR mismatched command completion event\n");
>> -        return;
>> +        /*
>> +         * For the 'command ring stopped' completion event, there
>> +         * is a risk of a mismatch in dequeue pointers if we abort
>> +         * the command just before the link TRB in the command ring.
>> +         * In this scenario, the cmd_dma in the event would point
>> +         * to a link TRB, while the software dequeue pointer circles
>> +         * back to the start.
>> +         */
>> +        if (!(cmd_comp_code == COMP_COMMAND_RING_STOPPED &&
>> +              is_dma_link_trb(xhci->cmd_ring, cmd_dma))) {
> 
> 
> Do we in this COMP_COMMAND_RING_STOPPED case even need to check if
> cmd_dma != (u64)cmd_dequeue_dma, or if command ring stopped on a link TRB?
> 
> Could we just move the COMP_COMMAND_RING_STOPPED handling a bit earlier?
> 
> if (cmd_comp_code == COMP_COMMAND_RING_STOPPED) {
>     complete_all(&xhci->cmd_ring_stop_completion);
>     return;
> }
> 
> If I remember correctly it should just turn aborted command TRBs into
> no-ops, and restart the command ring.
> 

Thanks for reviewing the changes!

Yes, you’re right. As part of restarting the command ring, we just ring
the doorbell.

If we move the event handling earlier, before the dequeue pointer is
validated, isn’t there a risk in not checking what the xHC holds in its
dequeue pointer? Since we don’t set it on restart, the xHC resumes from
wherever it stopped. What if that pointer got corrupted, or does not
point to any TRB in the command ring?
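
To make the concern concrete, here is a minimal userspace sketch (not
driver code) of the kind of check I have in mind: classify the DMA
address reported in the event as a normal TRB, a Link TRB, or an address
outside the ring. All names (model_seg, model_ring_init, classify_dma)
and the tiny 4-TRB segment are assumptions made up for illustration; the
real driver uses struct xhci_segment, TRB_SEGMENT_SIZE and trb_is_link().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TRBS_PER_SEG      4u   /* hypothetical tiny segment, illustration only */
#define TRB_SIZE          16u  /* an xHCI TRB is 16 bytes */
#define TRB_TYPE_LINK     6u   /* Link TRB type ID */
#define TRB_TYPE_CMD_NOOP 23u  /* No Op Command TRB type ID */

/* Simplified stand-in for union xhci_trb: only the TRB type is modeled. */
struct model_trb {
    uint32_t type;
};

/* Simplified stand-in for struct xhci_segment. */
struct model_seg {
    uint64_t dma;                        /* bus address of trbs[0] */
    struct model_trb trbs[TRBS_PER_SEG];
    struct model_seg *next;              /* circular list of segments */
};

enum dma_class { DMA_INVALID, DMA_NORMAL_TRB, DMA_LINK_TRB };

/* Build a one-segment ring whose last TRB is a Link TRB. */
void model_ring_init(struct model_seg *seg, uint64_t dma)
{
    assert(seg != NULL);
    seg->dma = dma;
    seg->next = seg;
    for (size_t i = 0; i < TRBS_PER_SEG - 1; i++)
        seg->trbs[i].type = TRB_TYPE_CMD_NOOP;
    seg->trbs[TRBS_PER_SEG - 1].type = TRB_TYPE_LINK;
}

/*
 * Walk every segment of the ring. An address is accepted only if it
 * falls inside a segment and is aligned to a TRB boundary; anything
 * else is reported as DMA_INVALID (e.g. a corrupted dequeue pointer).
 */
enum dma_class classify_dma(struct model_seg *first, uint64_t dma)
{
    struct model_seg *seg = first;

    assert(first != NULL);
    do {
        uint64_t end = seg->dma + (uint64_t)TRBS_PER_SEG * TRB_SIZE;

        if (dma >= seg->dma && dma < end) {
            uint64_t off = dma - seg->dma;

            if (off % TRB_SIZE)
                return DMA_INVALID;      /* inside a segment, but misaligned */
            if (seg->trbs[off / TRB_SIZE].type == TRB_TYPE_LINK)
                return DMA_LINK_TRB;
            return DMA_NORMAL_TRB;
        }
        seg = seg->next;
    } while (seg != first);

    return DMA_INVALID;                  /* not in any segment of this ring */
}
```

With the ring placed at 0x1000, aborting the command at index 2 makes
the "next TRB" address 0x1030, which this sketch classifies as
DMA_LINK_TRB. That is the case the patch wants to tolerate. A misaligned
address such as 0x1008, or anything outside 0x1000..0x103f, comes back
as DMA_INVALID, which is the case I would still like to warn about.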

> Thanks
> Mathias
> 

Thanks,
Faisal
