Hi linux-arm-msm list, We have been using the mhi driver with a Sierra EM9191 5G modem module and have seen an occasional issue where the kernel would crash with messages showing "BUG: Bad page state" which we debugged further and found it was due to mhi_net_ul_callback freeing the same skb multiple times, further debugging tracked this down to a case where parse_xfer_event computed a dev_rp from the passed event's ev_tre which does not lie within the region of valid "in flight" transfers for the tre_ring. See the patch below for how this was detected. I believe that receiving such an event results in the loop which runs completion events for the transfers to re-run some completion callbacks as it walks all the way around the ring again to reach the invalid dev_rp position. What could cause parse_xfer_event to receive a transfer event with a tre pointer which would be outside the range of in-flight transfers? For example receiving events where the tre pointers do not only increase or receive a second event of types MHI_EV_CC_OVERFLOW, MHI_EV_CC_EOB, or MHI_EV_CC_EOT for a previous tre. The existing mhi driver code appears to assume that transfer events are received strictly in order such that you can never receive a transfer completion event for a transfer descriptor outside the current set of "in flight" transfers in the tre ring (those between the read pointer and write pointer). Thanks, Paul Davey ----8<---- >From bf3e3821a90b89758a30cefed662d32a78dcb766 Mon Sep 17 00:00:00 2001 From: Paul Davey <paul.davey@xxxxxxxxxxxxxxxxxxx> Date: Fri, 6 Aug 2021 15:36:05 +1200 Subject: [PATCH] bus: mhi: Detect invalid xfer event dev_rp Signed-off-by: Paul Davey <paul.davey@xxxxxxxxxxxxxxxxxxx> --- drivers/bus/mhi/core/main.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c index c67fd001ded1..238689a5dfc7 100644 --- a/drivers/bus/mhi/core/main.c +++ b/drivers/bus/mhi/core/main.c @@ -596,6 +596,7 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl, void *dev_rp; struct mhi_buf_info *buf_info; u16 xfer_len; + bool rp_valid; if (!is_valid_ring_ptr(tre_ring, ptr)) { dev_err(&mhi_cntrl->mhi_dev->dev, @@ -609,6 +610,15 @@ static int parse_xfer_event(struct mhi_controller *mhi_cntrl, if (dev_rp >= (tre_ring->base + tre_ring->len)) dev_rp = tre_ring->base; + if (tre_ring->rp < tre_ring->wp) + rp_valid = (dev_rp <= tre_ring->wp && dev_rp > tre_ring->rp); + else + rp_valid = !(dev_rp <= tre_ring->rp && dev_rp > tre_ring->wp); + + WARN_ON(!rp_valid); + if (!rp_valid) + goto end_process_tx_event; + result.dir = mhi_chan->dir; local_rp = tre_ring->rp; -- 2.32.0