Re: [PATCH v3] bus: mhi: Fix race while handling SYS_ERR at power up

Hey,

On Thu, Nov 18, 2021 at 10:55:51AM +0100, Aleksander Morgado wrote:
> Hey Mani,
> 
> On Thu, Nov 18, 2021 at 6:57 AM Manivannan Sadhasivam
> <manivannan.sadhasivam@xxxxxxxxxx> wrote:
> >
> > Some devices tend to trigger a SYS_ERR interrupt while the host is
> > handling the SYS_ERR state of the device during power up. This creates
> > a race condition and causes a failure in booting up the device.
> >
> > The issue is seen on the Sierra Wireless EM9191 modem during SYS_ERR
> > handling in mhi_async_power_up(). Once the host detects that the device
> > is in SYS_ERR state, it issues MHI_RESET and waits for the device to
> > process the reset request. During this time, the device triggers a
> > SYS_ERR interrupt to the host, and the host starts executing its
> > SYS_ERR handling.
> >
> > So by the time the device has completed the reset, the host has already
> > started SYS_ERR handling. This causes the race condition and the modem
> > fails to boot.
> >
> > Hence, register the IRQ handler only after handling the SYS_ERR check
> > to avoid getting spurious IRQs from the device.
> >
> > Cc: stable@xxxxxxxxxxxxxxx
> > Fixes: e18d4e9fa79b ("bus: mhi: core: Handle syserr during power_up")
> > Reported-by: Aleksander Morgado <aleksander@xxxxxxxxxxxxx>
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@xxxxxxxxxx>
> > ---
> >
> > Changes in v3:
> >
> > * Moved BHI_INTVEC setup after irq setup
> > * Used interval_us as the delay for the polling API
> >
> > Changes in v2:
> >
> > * Switched to "mhi_poll_reg_field" for detecting MHI reset in device.
> >
> 
> I tried this v3 patch and I'm not sure if it's working properly in my
> setup; not all boots are successfully bringing the modem up.
> 

Ouch!

> Once I installed it, I kept getting logs like these on every boot:
> [    7.030407] mhi-pci-generic 0000:01:00.0: BAR 0: assigned [mem 0x600000000-0x600000fff 64bit]
> [    7.038984] mhi-pci-generic 0000:01:00.0: enabling device (0000 -> 0002)
> [    7.045814] mhi-pci-generic 0000:01:00.0: using shared MSI
> [    7.052191] mhi mhi0: Requested to power ON
> [    7.168042] mhi mhi0: Power on setup success
> [    7.168141] mhi mhi0: Wait for device to enter SBL or Mission mode
> [   15.687938] mhi-pci-generic 0000:01:00.0: failed to suspend device: -16

[...]

> I didn't try the v1 or v2 patches (sorry!), so not sure if the issues
> come in this last iteration or in an earlier one. Do you want me to
> try with v1 and v2 as well?
> 

Yes, please. Nothing changed other than moving the BHI_INTVEC programming.

Thanks,
Mani

> The patch that was working very reliably (100%) for me was the "bus:
> mhi: Register IRQ handler after SYS_ERR check during power up" one,
> which you attached here:
> https://www.spinics.net/lists/linux-arm-msm/msg97646.html
> 
> -- 
> Aleksander
> https://aleksander.es
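
To illustrate the ordering the quoted commit message describes (wait for the reset to complete before any SYS_ERR interrupt handler is registered), here is a minimal user-space sketch in C. It is not the kernel code: `fake_dev`, `read_ctrl()`, `device_raise_syserr()`, `poll_reg_field()`, `power_up()` and `CTRL_RESET_MASK` are hypothetical names made up for this sketch, the device is simulated in memory, and the delay between polls is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTRL_RESET_MASK 0x2u /* hypothetical reset bit, not a real MHI value */

/* Hypothetical simulated device: the reset bit clears after a few reads. */
struct fake_dev {
    uint32_t ctrl;         /* simulated control register */
    int reads_until_clear; /* reads left before the device finishes reset */
    bool irq_registered;   /* has the host registered its IRQ handler? */
    int syserr_handled;    /* SYS_ERR interrupts the host acted on */
};

/* Read the simulated register; the device clears RESET when it is done. */
static uint32_t read_ctrl(struct fake_dev *dev)
{
    if (dev->reads_until_clear > 0 && --dev->reads_until_clear == 0)
        dev->ctrl &= ~CTRL_RESET_MASK;
    return dev->ctrl;
}

/* The device raises SYS_ERR; the host reacts only if a handler exists. */
static void device_raise_syserr(struct fake_dev *dev)
{
    if (dev->irq_registered)
        dev->syserr_handled++;
}

/* Poll until (reg & mask) == val; return 0 on success, -1 on timeout. */
static int poll_reg_field(struct fake_dev *dev, uint32_t mask, uint32_t val,
                          int retries)
{
    while (retries-- > 0) {
        if ((read_ctrl(dev) & mask) == val)
            return 0;
        /* Assumption: the device interrupts while it is still resetting. */
        device_raise_syserr(dev);
    }
    return -1;
}

/* Power-up sketch: register_irq_first selects the racy (early) ordering. */
int power_up(struct fake_dev *dev, bool register_irq_first)
{
    assert(dev);
    if (register_irq_first)
        dev->irq_registered = true;

    dev->ctrl |= CTRL_RESET_MASK; /* host issues the reset */
    if (poll_reg_field(dev, CTRL_RESET_MASK, 0, 10) != 0)
        return -1; /* reset never completed within the retry budget */

    dev->irq_registered = true; /* safe point: the reset is finished */
    return 0;
}
```

In this simulation, a device that needs three reads to finish its reset leaves `syserr_handled` at 0 when the handler is registered late, and at 2 when it is registered before the reset is issued. That is the spurious handling the patch is trying to avoid, reduced to a toy model.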


