Re: [PATCH v4 1/1] xhci: Correctly handle last TRB of isoc TD on Etron xHCI host

Mathias Nyman <mathias.nyman@xxxxxxxxxxxxxxx> · Fri, 7 Feb 2025 14:06:54 +0200

On 6.2.2025 0.42, Michał Pecio wrote:
Not giving back the TD when we get an event for the last TRB in the
TD sounds risky. With this change we assume all old and future ETRON
hosts will trigger this additional spurious success event.

error_mid_td can cope with hosts which don't produce the extra success
event; it was done this way to deal with buggy NECs. The cost is one
more ESIT of latency on TDs with error.

It makes giving back the TD depend on a future event we can't guarantee.

I still think it better fits the spurious success case.
It's not an error mid TD, it's a spurious success event sent by the host
after a completion (error) event for the last TRB in the TD.
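
To illustrate the distinction, here is a toy sketch, not driver code.
All names (toy_ep, toy_handle_event, ...) are made up for illustration,
and it only models events for the last TRB of a TD; mid-TD events are
left out. It assumes the host sends at most one extra success event
for the TRB that just completed with an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the xhci driver's API. */
enum toy_comp { TOY_SUCCESS, TOY_ERROR };
enum toy_action { TOY_GIVEBACK, TOY_IGNORE_SPURIOUS, TOY_REPORT_UNEXPECTED };

struct toy_ep {
	bool td_pending;        /* a TD is queued and not yet given back */
	uint64_t td_last_trb;   /* address of the last TRB of that TD */
	bool last_was_error;    /* previous TD finished with an error on its last TRB */
	uint64_t last_done_trb; /* last TRB of the TD we gave back most recently */
};

static enum toy_action toy_handle_event(struct toy_ep *ep, uint64_t trb,
					enum toy_comp code)
{
	/* Event for the last TRB of the pending TD: the TD is complete. */
	if (ep->td_pending && trb == ep->td_last_trb) {
		ep->td_pending = false;
		ep->last_done_trb = trb;
		ep->last_was_error = (code == TOY_ERROR);
		return TOY_GIVEBACK;
	}

	/*
	 * Success event for a TRB we already completed with an error:
	 * treat it as spurious and stay quiet, once.
	 */
	if (code == TOY_SUCCESS && ep->last_was_error &&
	    trb == ep->last_done_trb) {
		ep->last_was_error = false;
		return TOY_IGNORE_SPURIOUS;
	}

	/* Anything else has no matching TD and would be reported. */
	return TOY_REPORT_UNEXPECTED;
}
```

In this model giveback never waits for the second event, so a host
that doesn't send it behaves the same as one that does.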

Making this change to error_mid_td code also makes that code more
confusing and harder to follow.

I think we could handle this more like the XHCI_SPURIOUS_SUCCESS case
seen with short transfers, and just silence the error message.

That's a little dodgy because it frees the TD before the HC is
completely done with it. *Probably* no problem with data buffers
(no sensible reason to DMA into them after an earlier error), but
we could overwrite the transfer ring in rare cases and IDK if it
would or wouldn't cause problems in this particular case.

We did get an event for the last TRB in the TD, so in normal cases
this TD should be considered complete, and given back.

I don't think the controller has any reason to touch data buffers at
this stage either. Can't recall any iommu/dma issues related to this.

Same applies to the "short packet" case existing today. I thought
about fixing it, but IIRC I ran into some differences between HCs
or out of spec behavior and it got tricky.

For the short transfer case this is a more valid concern. Here we give
back the TD after an event mid TD, and we know hardware might still
walk the rest of the TD. It shouldn't touch data buffers either, as a
short transfer indicates all data has been written.

Maybe it would make sense to separate giveback (and freeing of the
data buffer by class drivers) from transfer ring inc_deq(). Do the
former when we reasonably believe the HC won't touch the buffers
anymore, do the latter when we are sure that it's in the next TD.

This sounds reasonable; it makes sense to keep the software dequeue
pointer where hardware last reported its position. Currently we
advance it to where we assume hardware will be next.
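
A rough model of that split, as a sketch only. The names are
hypothetical and do not exist in the driver, TRBs are plain indices,
and ring wraparound is ignored to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model; none of these names exist in the xhci driver. */
struct toy_td {
	size_t first_trb;  /* index of the first TRB of this TD */
	size_t last_trb;   /* index of the last TRB of this TD */
	bool given_back;   /* buffer already returned to the class driver */
};

struct toy_ring {
	size_t sw_deq;     /* last position hardware reported in an event */
	size_t num_trbs;   /* ring size; wraparound is not modelled */
};

/* Step 1: give back once we reasonably believe the HC is done with the buffers. */
static void toy_giveback(struct toy_td *td)
{
	td->given_back = true;
}

/* Step 2: move the dequeue pointer only to a position hardware has reported. */
static void toy_note_hw_position(struct toy_ring *ring, size_t event_trb)
{
	assert(event_trb < ring->num_trbs);
	ring->sw_deq = event_trb;
}

/* The TD's TRBs may be overwritten only after hardware was seen past the TD. */
static bool toy_td_trbs_reusable(const struct toy_ring *ring,
				 const struct toy_td *td)
{
	return td->given_back && ring->sw_deq > td->last_trb;
}
```

With a TD covering TRBs 0..3, a short transfer event on TRB 1 allows
toy_giveback() right away, but toy_td_trbs_reusable() stays false until
an event reports a position of 4 or later.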

But this is a separate project.
Might need some work around in the driver.

Thanks
Mathias