On Fri, 5 Mar 2021 at 16:34, Loic Poulain <loic.poulain@xxxxxxxxxx> wrote: > > On Fri, 5 Mar 2021 at 16:09, Jeffrey Hugo <jhugo@xxxxxxxxxxxxxx> wrote: > > > > On 3/5/2021 8:08 AM, Loic Poulain wrote: > > > Hi Jeffrey, > > > > > > On Fri, 5 Mar 2021 at 15:49, Jeffrey Hugo <jhugo@xxxxxxxxxxxxxx> wrote: > > >> > > >> On 3/5/2021 7:09 AM, Loic Poulain wrote: > > >>> mhi_cntrl->timeout_ms is set by the controller and indicates the > > >>> maximum amount of time the controller device will take to be ready. > > >>> In case of PCI modems, this value is quite high given modems can take > > >>> up to 15 seconds from cold boot to be ready. > > >>> > > >>> Reusing this value in mhi_pm_resume can cause huge resuming latency > > >>> and delay the whole system resume (in case of system wide suspend/ > > >>> resume), leading to bad use experience. > > >> > > >> I think this needs more explanation. The timeout is a maximum value. > > >> You indicate that 2 seconds is more than enough for any MHI device to > > >> exit M3 (citation needed), but 15 seconds is too much? The difference > > >> should only be apparent when the device doesn't transition in the timeout. > > >> > > >> Put another way, this doesn't say why 15 seconds is bad, if every device > > >> only needs 2, given that wait_event_timeout() doesn't always wait for > > >> the entire timeout value if the event occurs earlier. > > > > > > Yes, right that deserves an explanation: depending on the platform and > > > the suspend type (deep, s2idle), the PCI device may or may not lose > > > power. In case power is maintained, there is no problem and the > > > controller is successfully moved to M0. But in case of power loss, the > > > device is going to restart, and MHI resuming is going to timeout and > > > fail since M0 will never be reached. On PCI side we simply > > > reinitialize the controller in case of resume failure. So in other > > > words, MHI resume is expected to fail in some cases and it should be > > > handled with minimal impact on the system. > > > > Can we detect the power loss in far less than 2 seconds, and abort the > > resume process? Waiting for the entire timeout, regardless of the > > value, in the power loss scenario you describe seems less than ideal for > > the system impact you are attempting to optimize. > > That's a good question, like checking the state is M3 before trying > anything, need to check that. Ok, please discard this patch, I've submitted another change that takes care of this more properly. Thanks, Jeffrey for challenging this. Loic