Hi Jeffrey, On Wed, 10 Mar 2021 at 17:19, Jeffrey Hugo <jhugo@xxxxxxxxxxxxxx> wrote: > > On 3/10/2021 4:38 AM, Loic Poulain wrote: > > Some buggy hardwares (e.g sdx24) may report the current command > > ring wp pointer instead of the command completion pointer. It's > > obviously wrong, causing completion timeout. We can however deal > > with that situation by completing the cmd n-1 element, which is > > what the device actually completes. > > > > Signed-off-by: Loic Poulain <loic.poulain@xxxxxxxxxx> > > --- > > drivers/bus/mhi/core/main.c | 18 ++++++++++++++++++ > > 1 file changed, 18 insertions(+) > > > > diff --git a/drivers/bus/mhi/core/main.c b/drivers/bus/mhi/core/main.c > > index 16b9640..3e3c520 100644 > > --- a/drivers/bus/mhi/core/main.c > > +++ b/drivers/bus/mhi/core/main.c > > @@ -707,6 +707,7 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl, > > { > > dma_addr_t ptr = MHI_TRE_GET_EV_PTR(tre); > > struct mhi_cmd *cmd_ring = &mhi_cntrl->mhi_cmd[PRIMARY_CMD_RING]; > > + struct device *dev = &mhi_cntrl->mhi_dev->dev; > > struct mhi_ring *mhi_ring = &cmd_ring->ring; > > struct mhi_tre *cmd_pkt; > > struct mhi_chan *mhi_chan; > > @@ -714,6 +715,23 @@ static void mhi_process_cmd_completion(struct mhi_controller *mhi_cntrl, > > > > cmd_pkt = mhi_to_virtual(mhi_ring, ptr); > > > > + if (unlikely(cmd_pkt == mhi_ring->wp)) { > > + /* Some buggy hardwares (e.g sdx24) sometimes report the current > > + * command ring wp pointer instead of the command completion > > + * pointer. It's obviously wrong, causing completion timeout. We > > + * can however deal with that situation by completing the cmd > > + * n-1 element. > > + */ > > + void *ring_ptr = (void *)cmd_pkt - mhi_ring->el_size; > > + > > + if (ring_ptr < mhi_ring->base) > > + ring_ptr += mhi_ring->len; > > + > > + cmd_pkt = ring_ptr; > > + > > + dev_warn(dev, "Bad completion pointer (ptr == ring_wp)\n"); > > Is there value in having this warning every time? I wonder if a _once > version would be better to not flood the kernel log. Although this is > only for commands, which shouldn't be frequent, so maybe that is the > implicit rate limiter. > > What do you think? As you said it's kind of self rate-limited because of the unfrequent command operations, mostly for starting and stopping channels. A _once variant would hide the issue a bit, and probably not annoying enough to raise curiosity. > > > + } > > + > > chan = MHI_TRE_GET_CMD_CHID(cmd_pkt); > > mhi_chan = &mhi_cntrl->mhi_chan[chan]; > > write_lock_bh(&mhi_chan->lock); > > > > > -- > Jeffrey Hugo > Qualcomm Technologies, Inc. is a member of the > Code Aurora Forum, a Linux Foundation Collaborative Project.