On Fri, May 24, 2024 at 6:28 AM Hector Martin <marcan@xxxxxxxxx> wrote: > > When multiple streams are in use, multiple TDs might be in flight when > an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for > each, to ensure everything is reset properly and the caches cleared. > Change the logic so that any N>1 TDs found active for different streams > are deferred until after the first one is processed, calling > xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to > queue another command until we are done with all of them. Also change > the error/"should never happen" paths to ensure we at least clear any > affected TDs, even if we can't issue a command to clear the hardware > cache, and complain loudly with an xhci_warn() if this ever happens. > > This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct > assumptions about number of rings per endpoint.") early on in the XHCI > driver's life, when stream support was first added. At that point, this > condition would cause TDs to not be given back at all, causing hanging > transfers - but no security bug. It was then identified but not fixed > nor made into a warning in commit 674f8438c121 ("xhci: split handling > halted endpoints into two steps"), which added a FIXME comment for the > problem case (without materially changing the behavior as far as I can > tell, though the new logic made the problem more obvious). > > Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some > cached cancelled URBs."), it was acknowledged again. This commit was > unfortunately not reviewed at all, as it was authored by the maintainer > directly. Had it been, perhaps a second set of eyes would've noticed > that it does not fix the bug, but rather just makes it (much) worse. > It turns the "transfers hang" bug into a "random memory corruption" bug, > by blindly marking TDs as complete without actually clearing them at all > nor moving the dequeue pointer past them, which means they aren't actually > complete, and the xHC will try to transfer data to/from them when the > endpoint resumes, now to freed memory buffers. > > This could have been a legitimate oversight, but apparently the commit > author was aware of the problem (yet still chose to submit it): It was > still mentioned as a FIXME, an xhci_dbg() was added to log the problem > condition, and the remaining issue was mentioned in the commit > description. The choice of making the log type xhci_dbg() for what is, > at this point, a completely unhandled and known broken condition is > puzzling and unfortunate, as it guarantees that no actual users would > see the log in production, thereby making it nigh undebuggable (indeed, > even if you turn on DEBUG, the message doesn't really hint at there > being a problem at all). > > It took me *months* of random xHC crashes to finally find a reliable > repro and be able to do a deep dive debug session, which could all have > been avoided had this unhandled, broken condition been actually reported > with a warning, as it should have been as a bug intentionally left in > unfixed (never mind that it shouldn't have been left in at all). > > > Another fix to solve clearing the caches of all stream rings with > > cancelled TDs is needed, but not as urgent. > > 3 years after that statement and 14 years after the original bug was > introduced, I think it's finally time to fix it. And maybe next time > let's not leave bugs unfixed (that are actually worse than the original > bug), and let's actually get people to review kernel commits please. > > Fixes xHC crashes and IOMMU faults with UAS devices when handling > errors/faults. Easiest repro is to use `hdparm` to mark an early sector > (e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop. > At least in the case of JMicron controllers, the read errors end up > having to cancel two TDs (for two queued requests to different streams) > and the one that didn't get cleared properly ends up faulting the xHC > entirely when it tries to access DMA pages that have since been unmapped, > referred to by the stale TDs. This normally happens quickly (after two > or three loops). After this fix, I left the `cat` in a loop running > overnight and experienced no xHC failures, with all read errors > recovered properly. Repro'd and tested on an Apple M1 Mac Mini > (dwc3 host). > > On systems without an IOMMU, this bug would instead silently corrupt > freed memory, making this a security bug (even on systems with IOMMUs > this could silently corrupt memory belonging to other USB devices on the > same controller, so it's still a security bug). Given that the kernel > autoprobes partition tables, I'm pretty sure a malicious USB device > pretending to be a UAS device and reporting an error with the right > timing could deliberately trigger a UAF and write to freed memory, with > no user action. > > Fixes: e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.") > Fixes: 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.") > Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps") > Cc: stable@xxxxxxxxxxxxxxx > Cc: security@xxxxxxxxxx > Signed-off-by: Hector Martin <marcan@xxxxxxxxx> > --- > drivers/usb/host/xhci-ring.c | 54 +++++++++++++++++++++++++++++++++++--------- > drivers/usb/host/xhci.h | 1 + > 2 files changed, 44 insertions(+), 11 deletions(-) > > diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c > index 575f0fd9c9f1..9c06502be098 100644 > --- a/drivers/usb/host/xhci-ring.c > +++ b/drivers/usb/host/xhci-ring.c > @@ -1034,13 +1034,27 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep) > break; > case TD_DIRTY: /* TD is cached, clear it */ > case TD_HALTED: > + case TD_CLEARING_CACHE_DEFERRED: > + if (cached_td) { > + if (cached_td->urb->stream_id != td->urb->stream_id) { > + /* Multiple streams case, defer move dq */ > + xhci_dbg(xhci, > + "Move dq deferred: stream %u URB %p\n", > + td->urb->stream_id, td->urb); > + td->cancel_status = TD_CLEARING_CACHE_DEFERRED; > + break; > + } > + > + /* Should never happen, at least try to clear the TD if it does */ > + xhci_warn(xhci, > + "Found multiple active URBs %p and %p in stream %u?\n", > + td->urb, cached_td->urb, > + td->urb->stream_id); > + td_to_noop(xhci, ring, cached_td, false); > + cached_td->cancel_status = TD_CLEARED; > + } > + > td->cancel_status = TD_CLEARING_CACHE; > - if (cached_td) > - /* FIXME stream case, several stopped rings */ > - xhci_dbg(xhci, > - "Move dq past stream %u URB %p instead of stream %u URB %p\n", > - td->urb->stream_id, td->urb, > - cached_td->urb->stream_id, cached_td->urb); > cached_td = td; > break; > } > @@ -1060,10 +1074,16 @@ static int xhci_invalidate_cancelled_tds(struct xhci_virt_ep *ep) > if (err) { > /* Failed to move past cached td, just set cached TDs to no-op */ > list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list, cancelled_td_list) { > - if (td->cancel_status != TD_CLEARING_CACHE) > + /* > + * Deferred TDs need to have the deq pointer set after the above command > + * completes, so if that failed we just give up on all of them (and > + * complain loudly since this could cause issues due to caching). > + */ > + if (td->cancel_status != TD_CLEARING_CACHE && > + td->cancel_status != TD_CLEARING_CACHE_DEFERRED) > continue; > - xhci_dbg(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n", > - td->urb); > + xhci_warn(xhci, "Failed to clear cancelled cached URB %p, mark clear anyway\n", > + td->urb); > td_to_noop(xhci, ring, td, false); > td->cancel_status = TD_CLEARED; > } > @@ -1350,6 +1370,7 @@ static void xhci_handle_cmd_set_deq(struct xhci_hcd *xhci, int slot_id, > struct xhci_ep_ctx *ep_ctx; > struct xhci_slot_ctx *slot_ctx; > struct xhci_td *td, *tmp_td; > + bool deferred = false; > > ep_index = TRB_TO_EP_INDEX(le32_to_cpu(trb->generic.field[3])); > stream_id = TRB_TO_STREAM_ID(le32_to_cpu(trb->generic.field[2])); > @@ -1436,6 +1457,8 @@ static void xhci_handle_cmd_set_deq(struct xhci_hcd *xhci, int slot_id, > xhci_dbg(ep->xhci, "%s: Giveback cancelled URB %p TD\n", > __func__, td->urb); > xhci_td_cleanup(ep->xhci, td, ep_ring, td->status); > + } else if (td->cancel_status == TD_CLEARING_CACHE_DEFERRED) { > + deferred = true; > } else { > xhci_dbg(ep->xhci, "%s: Keep cancelled URB %p TD as cancel_status is %d\n", > __func__, td->urb, td->cancel_status); > @@ -1445,8 +1468,17 @@ static void xhci_handle_cmd_set_deq(struct xhci_hcd *xhci, int slot_id, > ep->ep_state &= ~SET_DEQ_PENDING; > ep->queued_deq_seg = NULL; > ep->queued_deq_ptr = NULL; > - /* Restart any rings with pending URBs */ > - ring_doorbell_for_active_rings(xhci, slot_id, ep_index); > + > + if (deferred) { > + /* We have more streams to clear */ > + xhci_dbg(ep->xhci, "%s: Pending TDs to clear, continuing with invalidation\n", > + __func__); > + xhci_invalidate_cancelled_tds(ep); > + } else { > + /* Restart any rings with pending URBs */ > + xhci_dbg(ep->xhci, "%s: All TDs cleared, ring doorbell\n", __func__); > + ring_doorbell_for_active_rings(xhci, slot_id, ep_index); > + } > } > > static void xhci_handle_cmd_reset_ep(struct xhci_hcd *xhci, int slot_id, > diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h > index 6f4bf98a6282..aa4379bdb90c 100644 > --- a/drivers/usb/host/xhci.h > +++ b/drivers/usb/host/xhci.h > @@ -1276,6 +1276,7 @@ enum xhci_cancelled_td_status { > TD_DIRTY = 0, > TD_HALTED, > TD_CLEARING_CACHE, > + TD_CLEARING_CACHE_DEFERRED, > TD_CLEARED, > }; > > > --- > base-commit: a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6 > change-id: 20240524-xhci-streams-124e88db52e6 > Intense story, relatively simple fix. :) Reviewed-by: Neal Gompa <neal@xxxxxxxxx> -- 真実はいつも一つ!/ Always, there's only one truth!