On Thu, Oct 13, 2022 at 09:40:09AM +0800, Aiqun(Maria) Yu wrote:
> Hi Mathieu,
>
> On 10/13/2022 4:43 AM, Mathieu Poirier wrote:
> > Please add what has changed from one version to another, either in a cover
> > letter or after the "Signed-off-by". There are many examples on how to do that
> > on the mailing list.
> >
> Thx for the information, will take a note and benefit for next time.
>
> > On Fri, Sep 16, 2022 at 03:12:31PM +0800, Maria Yu wrote:
> > > RPROC_OFFLINE state indicate there is no recovery process
> > > is in progress and no chance to do the pm_relax.
> > > Because when recovering from crash, rproc->lock is held and
> > > state is RPROC_CRASHED -> RPROC_OFFLINE -> RPROC_RUNNING,
> > > and then unlock rproc->lock.
> >
> > You are correct - because the lock is held rproc->state should be set to RPROC_RUNNING
> > when rproc_trigger_recovery() returns. If that is not the case then something
> > went wrong.
> >
> > Function rproc_stop() sets rproc->state to RPROC_OFFLINE just before returning,
> > so we know the remote processor was stopped. Therefore if rproc->state is set
> > to RPROC_OFFLINE something went wrong in either request_firmware() or
> > rproc_start(). Either way the remote processor is offline and the system probably
> > in an unknown/unstable. As such I don't see how calling pm_relax() can help
> > things along.
> >
> PROC_OFFLINE is possible that rproc_shutdown is triggered and successfully
> finished.
> Even if it is multi crash rproc_crash_handler_work contention issue, and
> last rproc_trigger_recovery bailed out with only
> rproc->state==RPROC_OFFLINE, it is still worth to do pm_relax in pair.
> Since the subsystem may still can be recovered with customer's next trigger
> of rproc_start, and we can make each error out path clean with pm resources.
>
> > I suggest spending time understanding what leads to the failure when recovering
> > from a crash and address that problem(s).
> >
> In current case, the customer's information is that the issue happened when
> rproc_shutdown is triggered at similar time. So not an issue from error out
> of rproc_trigger_recovery.

That is a very important element to consider and should have been mentioned from
the beginning. What I see happening is the following:

        rproc_report_crash()
                pm_stay_awake()
                queue_work() // current thread is suspended

        rproc_shutdown()
                rproc_stop()
                        rproc->state = RPROC_OFFLINE;

        rproc_crash_handler_work()
                if (rproc->state == RPROC_OFFLINE)
                        return // pm_relax() is not called

The right way to fix this is to add a pm_relax() in rproc_shutdown() and
rproc_detach(), along with a very descriptive comment as to why it is needed.

> > Thanks,
> > Mathieu
> >
> >
> > > When the state is in RPROC_OFFLINE it means separate request
> > > of rproc_stop was done and no need to hold the wakeup source
> > > in crash handler to recover any more.
> > >
> > > Signed-off-by: Maria Yu <quic_aiquny@xxxxxxxxxxx>
> > > ---
> > >  drivers/remoteproc/remoteproc_core.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > > index e5279ed9a8d7..6bc7b8b7d01e 100644
> > > --- a/drivers/remoteproc/remoteproc_core.c
> > > +++ b/drivers/remoteproc/remoteproc_core.c
> > > @@ -1956,6 +1956,17 @@ static void rproc_crash_handler_work(struct work_struct *work)
> > >  	if (rproc->state == RPROC_CRASHED || rproc->state == RPROC_OFFLINE) {
> > >  		/* handle only the first crash detected */
> > >  		mutex_unlock(&rproc->lock);
> > > +		/*
> > > +		 * RPROC_OFFLINE state indicate there is no recovery process
> > > +		 * is in progress and no chance to have pm_relax in place.
> > > +		 * Because when recovering from crash, rproc->lock is held and
> > > +		 * state is RPROC_CRASHED -> RPROC_OFFLINE -> RPROC_RUNNING,
> > > +		 * and then unlock rproc->lock.
> > > +		 * RPROC_OFFLINE is only an intermediate state in recovery
> > > +		 * process.
> > > +		 */
> > > +		if (rproc->state == RPROC_OFFLINE)
> > > +			pm_relax(rproc->dev.parent);
> > >  		return;
> > >  	}
> > > --
> > > 2.7.4
> > >
>
>
> --
> Thx and BRs,
> Aiqun(Maria) Yu
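
For readers following the thread, the sequence described in the reply above
(crash reported, shutdown wins the race, crash handler bails out) can be
sketched as a small user-space model. This is only an illustration and not
kernel code: every model_* name, the ws_active flag standing in for the wakeup
source, and the relax_on_shutdown switch standing in for the suggested
pm_relax() in rproc_shutdown() are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the race; nothing here is kernel API. */
enum model_state { MODEL_OFFLINE, MODEL_RUNNING, MODEL_CRASHED };

struct model_rproc {
	enum model_state state;
	bool ws_active;     /* stands in for the wakeup source being held */
	bool crash_queued;  /* stands in for the queued crash work item */
};

/* Models rproc_report_crash(): pm_stay_awake() then queue_work(). */
void model_report_crash(struct model_rproc *r)
{
	r->ws_active = true;
	r->crash_queued = true;
}

/* Models rproc_shutdown(); relax_on_shutdown models the suggested fix. */
void model_shutdown(struct model_rproc *r, bool relax_on_shutdown)
{
	r->state = MODEL_OFFLINE;       /* rproc_stop() */
	if (relax_on_shutdown)
		r->ws_active = false;   /* pm_relax() */
}

/* Models rproc_crash_handler_work() as quoted in the patch context. */
void model_crash_handler_work(struct model_rproc *r)
{
	if (!r->crash_queued)
		return;
	r->crash_queued = false;

	if (r->state == MODEL_CRASHED || r->state == MODEL_OFFLINE)
		return;                 /* early return: no pm_relax() */

	r->state = MODEL_CRASHED;
	r->state = MODEL_OFFLINE;       /* recovery: stop ... */
	r->state = MODEL_RUNNING;       /* ... then start */
	r->ws_active = false;           /* pm_relax() after recovery */
}

/* Runs crash -> shutdown -> handler; returns true if the source leaks. */
bool model_race(bool relax_on_shutdown)
{
	struct model_rproc r = { .state = MODEL_RUNNING };

	model_report_crash(&r);
	model_shutdown(&r, relax_on_shutdown);
	model_crash_handler_work(&r);
	return r.ws_active;
}

/* Self-check for the two outcomes discussed in the thread. */
void model_selfcheck(void)
{
	assert(model_race(false));   /* no relax in shutdown: source stays held */
	assert(!model_race(true));   /* relax in shutdown: source is released */
}
```

In this model, model_race(false) leaves ws_active set, which corresponds to the
system being kept awake, while model_race(true) clears it. The model ignores
locking and rproc_detach(); it only shows why the early return in the crash
handler needs a matching release somewhere on the shutdown path.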