Hey Loic,

On Mon, Nov 08, 2021 at 07:04:56PM +0100, Loic Poulain wrote:
> Hi Mani,
>
> On Mon, 8 Nov 2021 at 18:42, Manivannan Sadhasivam
> <manivannan.sadhasivam@xxxxxxxxxx> wrote:
> >
> > Some devices tend to trigger SYS_ERR interrupt while the host handling
> > SYS_ERR state of the device during power up. This creates a race
> > condition and causes a failure in booting up the device.
> >
> > The issue is seen on the Sierra Wireless EM9191 modem during SYS_ERR
> > handling in mhi_async_power_up(). Once the host detects that the device
> > is in SYS_ERR state, it issues MHI_RESET and waits for the device to
> > process the reset request. During this time, the device triggers SYS_ERR
> > interrupt to the host and host starts handling SYS_ERR execution.
> >
> > So by the time the device has completed reset, host starts SYS_ERR
> > handling. This causes the race condition and the modem fails to boot.
> >
> > Hence, register the IRQ handler only after handling the SYS_ERR check
> > to avoid getting spurious IRQs from the device.
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Fixes: e18d4e9fa79b ("bus: mhi: core: Handle syserr during power_up")
> > Reported-by: Aleksander Morgado <aleksander@xxxxxxxxxxxxx>
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> > ---
> >  drivers/bus/mhi/core/pm.c | 26 +++++++++++---------------
> >  1 file changed, 11 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
> > index fb99e3727155..ec5f11166820 100644
> > --- a/drivers/bus/mhi/core/pm.c
> > +++ b/drivers/bus/mhi/core/pm.c
> > @@ -1055,10 +1055,6 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
> >         mutex_lock(&mhi_cntrl->pm_mutex);
> >         mhi_cntrl->pm_state = MHI_PM_DISABLE;
> >
> > -       ret = mhi_init_irq_setup(mhi_cntrl);
> > -       if (ret)
> > -               goto error_setup_irq;
> > -
> >         /* Setup BHI INTVEC */
> >         write_lock_irq(&mhi_cntrl->pm_lock);
> >         mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
> > @@ -1072,7 +1068,7 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
> >                 dev_err(dev, "%s is not a valid EE for power on\n",
> >                         TO_MHI_EXEC_STR(current_ee));
> >                 ret = -EIO;
> > -               goto error_async_power_up;
> > +               goto error_setup_irq;
> >         }
> >
> >         state = mhi_get_mhi_state(mhi_cntrl);
> > @@ -1082,19 +1078,18 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
> >         if (state == MHI_STATE_SYS_ERR) {
> >                 mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
> >                 ret = wait_event_timeout(mhi_cntrl->state_event,
>
> Shouldn't we use a polling variant such as mhi_poll_reg_field() given
> the interrupts are not yet enabled?
>

Realised _just_ after sending the patch and already submitted v2. Please take
a look.

Thanks,
Mani

> > -                               MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
> > -                                       mhi_read_reg_field(mhi_cntrl,
> > -                                                          mhi_cntrl->regs,
> > -                                                          MHICTRL,
> > -                                                          MHICTRL_RESET_MASK,
> > -                                                          MHICTRL_RESET_SHIFT,
> > +                               mhi_read_reg_field(mhi_cntrl,
> > +                                                  mhi_cntrl->regs,
> > +                                                  MHICTRL,
> > +                                                  MHICTRL_RESET_MASK,
> > +                                                  MHICTRL_RESET_SHIFT,
> >                                                    &val) ||
> >                                         !val,
> >                                 msecs_to_jiffies(mhi_cntrl->timeout_ms));
> >                 if (!ret) {
> >                         ret = -EIO;
> >                         dev_info(dev, "Failed to reset MHI due to syserr state\n");
> > -                       goto error_async_power_up;
> > +                       goto error_setup_irq;
> >                 }
>
> Regards,
> Loic
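
For reference, the polling approach raised in the review can be sketched in plain userspace C. With no IRQ handler registered, nothing wakes a waiter on the state event, so the reset bit has to be read in a bounded loop instead. This is only an illustration under stated assumptions: struct fake_dev, fake_read_reg_field(), poll_reset_cleared() and the FAKE_* constants are hypothetical stand-ins, not the MHI API, and the v2 patch may do this differently.

```c
#include <stdint.h>

/* Hypothetical stand-in for a device; not part of the MHI stack. */
struct fake_dev {
	uint32_t ctrl;              /* simulated control register */
	unsigned reads_until_clear; /* reads before the "device" finishes reset */
};

#define FAKE_RESET_MASK  0x2u
#define FAKE_RESET_SHIFT 1

/* Read a masked register field; returns 0 on success. */
static int fake_read_reg_field(struct fake_dev *dev, uint32_t mask,
			       unsigned shift, uint32_t *out)
{
	/* Simulate the device clearing the reset bit after a few reads. */
	if (dev->reads_until_clear > 0 && --dev->reads_until_clear == 0)
		dev->ctrl &= ~FAKE_RESET_MASK;

	*out = (dev->ctrl & mask) >> shift;
	return 0;
}

/*
 * Poll until the reset field reads 0 or the retry budget is spent.
 * Returns 0 once the reset has completed, -1 on read error or timeout.
 */
static int poll_reset_cleared(struct fake_dev *dev, unsigned max_retries)
{
	uint32_t val;

	for (unsigned i = 0; i < max_retries; i++) {
		if (fake_read_reg_field(dev, FAKE_RESET_MASK,
					FAKE_RESET_SHIFT, &val))
			return -1;
		if (!val)
			return 0;
		/* A real driver would sleep or delay between reads here. */
	}
	return -1;
}
```

The retry count bounds the wait in the same way the timeout does in the wait_event_timeout() call quoted above, but completion is detected by reading the register, not by an interrupt-driven wakeup.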