RE: [PATCH V2 2/2] remoteproc: support attach recovery after rproc crash

> Subject: Re: [PATCH V2 2/2] remoteproc: support attach recovery after rproc
> crash
> 
> On Tue 08 Mar 00:48 CST 2022, Peng Fan (OSS) wrote:
> 
> > From: Peng Fan <peng.fan@xxxxxxx>
> >
> > Current logic only support main processor to stop/start the remote
> > processor after rproc crash. However to SoC, such as i.MX8QM/QXP, the
> > remote processor could do attach recovery after crash and trigger
> > watchdog
> 
> Does it really do something called "attach recovery and trigger watchdog
> reboot"? Doesn't it just reboot itself and Linux needs to detach and reattach
> to get something (what?) reset?

I mean the remote processor can restart on its own, without Linux loading
firmware or stopping/starting it. The Linux side only needs to detach and
re-attach to re-establish communication with the remote processor.

> 
> > reboot. It does not need main processor to load image, or stop/start
> > M4 core.
> >
> > Introduce two functions: rproc_attach_recovery,
> > rproc_firmware_recovery for the two cases. Firmware recovery is as
> > before, let main processor to help recovery, while attach recovery is recover
> > itself without help.
> > To attach recovery, we only do detach and attach.
> >
> > Signed-off-by: Peng Fan <peng.fan@xxxxxxx>
> > ---
> >
> > V2:
> >  use rproc_has_feature in patch 1/2
> >
> >  drivers/remoteproc/remoteproc_core.c | 67
> > ++++++++++++++++++++--------
> >  1 file changed, 48 insertions(+), 19 deletions(-)
> >
> > diff --git a/drivers/remoteproc/remoteproc_core.c
> > b/drivers/remoteproc/remoteproc_core.c
> > index 69f51acf235e..366fad475898 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1887,6 +1887,50 @@ static int __rproc_detach(struct rproc *rproc)
> >  	return 0;
> >  }
> >
> > +static int rproc_attach_recovery(struct rproc *rproc) {
> > +	int ret;
> > +
> > +	mutex_unlock(&rproc->lock);
> > +	ret = rproc_detach(rproc);
> > +	mutex_lock(&rproc->lock);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (atomic_inc_return(&rproc->power) > 1)
> 
> In the stop/coredump/start path the code _will_ attempt to recover the
> remote processor. With rproc_detach() and rproc_attach() fiddling with the
> rproc->power refcount this might do something, or it might not do something.
> And with the mutex_unlock() it's likely that you're opening of up for various
> race conditions inbetween.

rproc_boot() increments rproc->power.
rproc_detach() decrements rproc->power.
rproc_attach() does not touch rproc->power.

Attach recovery is detach followed by attach. So after the detach I added an
increment of rproc->power, with a "> 1" check, to keep the count balanced
before calling rproc_attach().
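
To make the counting above concrete, here is a minimal userspace sketch, not
the kernel code. It assumes only what is stated in this thread: boot
increments the count, detach decrements it, and attach leaves it alone. The
model_* names and struct model_rproc are hypothetical stand-ins, and locking
is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rproc; only the fields needed here. */
struct model_rproc {
	atomic_int power;	/* models rproc->power */
	bool attached;		/* true while the host side is attached */
	int attach_count;	/* how many times attach actually ran */
};

static void model_init(struct model_rproc *r)
{
	atomic_init(&r->power, 0);
	r->attached = false;
	r->attach_count = 0;
}

/* Attach does not touch the refcount. */
static int model_attach(struct model_rproc *r)
{
	r->attached = true;
	r->attach_count++;
	return 0;
}

/* Boot takes a reference; only the first user really attaches. */
static int model_boot(struct model_rproc *r)
{
	if (atomic_fetch_add(&r->power, 1) + 1 > 1)
		return 0;
	return model_attach(r);
}

/* Detach drops a reference; only the last user really detaches. */
static int model_detach(struct model_rproc *r)
{
	assert(atomic_load(&r->power) > 0);	/* catch underflow */
	if (atomic_fetch_sub(&r->power, 1) - 1 > 0)
		return 0;
	r->attached = false;
	return 0;
}

/* Recovery as in the patch: detach, re-take the reference, attach. */
static int model_attach_recovery(struct model_rproc *r)
{
	int ret = model_detach(r);

	if (ret)
		return ret;
	if (atomic_fetch_add(&r->power, 1) + 1 > 1)
		return 0;	/* other users remain, nothing was detached */
	return model_attach(r);
}
```

With a single user the count goes 1 -> 0 -> 1 across recovery and the attach
runs a second time. With two users the detach only drops the count, so this
model never re-attaches, which is the "might not do something" case from the
review. The sketch says nothing about the mutex_unlock() window.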

> 
> 
> PS. Does anyone actually use this refcount, or are we just all holding our
> breath for it never going beyond 1?

I think it is the latter.

Thanks,
Peng.

> 
> Regards,
> Bjorn
> 
> > +		return 0;
> > +
> > +	return rproc_attach(rproc);
> > +}
> > +
> > +static int rproc_firmware_recovery(struct rproc *rproc) {
> > +	const struct firmware *firmware_p;
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = rproc_stop(rproc, true);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* generate coredump */
> > +	rproc->ops->coredump(rproc);
> > +
> > +	/* load firmware */
> > +	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> > +	if (ret < 0) {
> > +		dev_err(dev, "request_firmware failed: %d\n", ret);
> > +		return ret;
> > +	}
> > +
> > +	/* boot the remote processor up again */
> > +	ret = rproc_start(rproc, firmware_p);
> > +
> > +	release_firmware(firmware_p);
> > +
> > +	return ret;
> > +}
> > +
> >  /**
> >   * rproc_trigger_recovery() - recover a remoteproc
> >   * @rproc: the remote processor
> > @@ -1901,7 +1945,6 @@ static int __rproc_detach(struct rproc *rproc)
> >   */
> >  int rproc_trigger_recovery(struct rproc *rproc)  {
> > -	const struct firmware *firmware_p;
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> >
> > @@ -1915,24 +1958,10 @@ int rproc_trigger_recovery(struct rproc
> > *rproc)
> >
> >  	dev_err(dev, "recovering %s\n", rproc->name);
> >
> > -	ret = rproc_stop(rproc, true);
> > -	if (ret)
> > -		goto unlock_mutex;
> > -
> > -	/* generate coredump */
> > -	rproc->ops->coredump(rproc);
> > -
> > -	/* load firmware */
> > -	ret = request_firmware(&firmware_p, rproc->firmware, dev);
> > -	if (ret < 0) {
> > -		dev_err(dev, "request_firmware failed: %d\n", ret);
> > -		goto unlock_mutex;
> > -	}
> > -
> > -	/* boot the remote processor up again */
> > -	ret = rproc_start(rproc, firmware_p);
> > -
> > -	release_firmware(firmware_p);
> > +	if (rproc_has_feature(rproc, RPROC_FEAT_ATTACH_RECOVERY))
> > +		ret = rproc_attach_recovery(rproc);
> > +	else
> > +		ret = rproc_firmware_recovery(rproc);
> >
> >  unlock_mutex:
> >  	mutex_unlock(&rproc->lock);
> > --
> > 2.30.0
> >



