On 2021-11-08 09:49 AM, Manivannan Sadhasivam wrote:
Some devices tend to trigger SYS_ERR interrupt while the host handling
SYS_ERR state of the device during power up. This creates a race
condition and causes a failure in booting up the device.
The issue is seen on the Sierra Wireless EM9191 modem during SYS_ERR
handling in mhi_async_power_up(). Once the host detects that the device
is in SYS_ERR state, it issues MHI_RESET and waits for the device to
process the reset request. During this time, the device triggers
SYS_ERR
interrupt to the host and host starts handling SYS_ERR execution.
So by the time the device has completed reset, host starts SYS_ERR
handling. This causes the race condition and the modem fails to boot.
Hence, register the IRQ handler only after handling the SYS_ERR check
to avoid getting spurious IRQs from the device.
Cc: stable@xxxxxxxxxxxxxxx
Fixes: e18d4e9fa79b ("bus: mhi: core: Handle syserr during power_up")
Reported-by: Aleksander Morgado <aleksander@xxxxxxxxxxxxx>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
---
Changes in v2:
* Switched to "mhi_poll_reg_field" for detecting MHI reset in device.
drivers/bus/mhi/core/pm.c | 32 ++++++++++----------------------
1 file changed, 10 insertions(+), 22 deletions(-)
diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c
index fb99e3727155..3c347fe9b10d 100644
--- a/drivers/bus/mhi/core/pm.c
+++ b/drivers/bus/mhi/core/pm.c
@@ -1038,7 +1038,6 @@ int mhi_async_power_up(struct mhi_controller
*mhi_cntrl)
enum mhi_ee_type current_ee;
enum dev_st_transition next_state;
struct device *dev = &mhi_cntrl->mhi_dev->dev;
- u32 val;
u32 interval_us = 25000; /* poll register field every 25 milliseconds */
int ret;
dev_info(dev, "Requested to power ON\n");
@@ -1055,10 +1054,6 @@ int mhi_async_power_up(struct mhi_controller
*mhi_cntrl)
mutex_lock(&mhi_cntrl->pm_mutex);
mhi_cntrl->pm_state = MHI_PM_DISABLE;
- ret = mhi_init_irq_setup(mhi_cntrl);
- if (ret)
- goto error_setup_irq;
-
/* Setup BHI INTVEC */
write_lock_irq(&mhi_cntrl->pm_lock);
mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
Let's also set the BHI INTVEC up _after_ the IRQ setup is done. I
believe doing
this before setting up IRQs may be racy.
@@ -1072,7 +1067,7 @@ int mhi_async_power_up(struct mhi_controller
*mhi_cntrl)
dev_err(dev, "%s is not a valid EE for power on\n",
TO_MHI_EXEC_STR(current_ee));
ret = -EIO;
- goto error_async_power_up;
+ goto error_setup_irq;
}
state = mhi_get_mhi_state(mhi_cntrl);
@@ -1081,20 +1076,12 @@ int mhi_async_power_up(struct mhi_controller
*mhi_cntrl)
if (state == MHI_STATE_SYS_ERR) {
mhi_set_mhi_state(mhi_cntrl, MHI_STATE_RESET);
- ret = wait_event_timeout(mhi_cntrl->state_event,
- MHI_PM_IN_FATAL_STATE(mhi_cntrl->pm_state) ||
- mhi_read_reg_field(mhi_cntrl,
- mhi_cntrl->regs,
- MHICTRL,
- MHICTRL_RESET_MASK,
- MHICTRL_RESET_SHIFT,
- &val) ||
- !val,
- msecs_to_jiffies(mhi_cntrl->timeout_ms));
- if (!ret) {
- ret = -EIO;
+ ret = mhi_poll_reg_field(mhi_cntrl, mhi_cntrl->regs, MHICTRL,
+ MHICTRL_RESET_MASK, MHICTRL_RESET_SHIFT, 0,
+ msecs_to_jiffies(mhi_cntrl->timeout_ms));
Use "interval_us" as the last argument here since this API takes the
polling interval
value in microseconds as the last argument.
+ if (ret) {
dev_info(dev, "Failed to reset MHI due to syserr state\n");
- goto error_async_power_up;
+ goto error_setup_irq;
}
/*
@@ -1104,6 +1091,10 @@ int mhi_async_power_up(struct mhi_controller
*mhi_cntrl)
mhi_write_reg(mhi_cntrl, mhi_cntrl->bhi, BHI_INTVEC, 0);
}
+ ret = mhi_init_irq_setup(mhi_cntrl);
+ if (ret)
+ goto error_setup_irq;
+
/* Transition to next state */
next_state = MHI_IN_PBL(current_ee) ?
DEV_ST_TRANSITION_PBL : DEV_ST_TRANSITION_READY;
@@ -1116,9 +1107,6 @@ int mhi_async_power_up(struct mhi_controller
*mhi_cntrl)
return 0;
-error_async_power_up:
- mhi_deinit_free_irq(mhi_cntrl);
-
error_setup_irq:
mhi_cntrl->pm_state = MHI_PM_DISABLE;
mutex_unlock(&mhi_cntrl->pm_mutex);
Thanks,
Bhaumik
---
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum,
a Linux Foundation Collaborative Project