Re: [PATCH v2 00/14] remoteproc: Add support for detaching from rproc

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Good day,

On Thu, Nov 12, 2020 at 07:20:49PM +0100, Arnaud POULIQUEN wrote:
> Hi Mathieu,
> 
> 
> On 10/30/20 8:56 PM, Mathieu Poirier wrote:
> > Following the work done here [1], this set provides support for the
> > remoteproc core to release resources associated with a remote processor
> > without having to switch it off. That way a platform driver can be removed
> > or the applcation processor power cycled while the remote processor is
> > still operating.
> 
> I performed some non-regression tests on firmware attachment + few tests on
> detach. I don't see any major problem introduced by this patchset (except the
> minor problem I reported in patch 11/14:remoteproc: correctly process a stop
> request when attached).
> 

Your assessment is correct - I will fix that.

> My concern is that without the bindings, we still have a problem on the recovery.

At this time (and while this feature is being worked on) we have problems with
shutdown and recovery.  The up side is that 1) we know about them and 2) we are
actively working on closing the gab.  As such I am weary of introducting
temporary measures visible to users that will have to be maintained in the
future.  I'd rather simply acknowledge that something is broken.

I will take into account your comments on the bindings (which I haven't looked
at yet).  From there I expect to spin off another revision early to middle of
next week.  If we get this part in for the v5.11 cycle then only crash scenarios
will be left to address.

Thanks for the feedback,
Mathieu

> If a crash occurs while attached to a remote processor, the remote framework is
> in an unexpected state, which requires a system reset to recover it.
> 
> To reproduce the issue, simply generate the crash :
>  cat 1 >/sys/kernel/debug/remoteproc/remoteproc0/crash
> 
> At the end of the mail, I attached a temporary patch to apply on top of this
> series, waiting for the bindings management. The patch shutdowns the attached
> remote processor instead of trying to recover it.
> 
> I wonder if we should fix this for version 4.10 based on the current
> implementation (if the patch window is not closed)...
> Please tell me what would be the best strategy. If it's not too late, I can
> prepare and send a patch tomorrow for v5.10.
> 
> Regards
> Arnaud
> > 
> > The only thing that changes in this revision are the last two patches where
> > device tree bindings to control how to handle attached remote processors have
> > been added.  More specifically two bindings are being proposed:
> > 
> > "autonomous_on_core_reboot": When rproc_cdev_release() or rproc_del() are called
> > and the remote processor has been attached to, it will be detached from (rather
> > than turned off) if "autonomous_on_core_reboot" is specified in the DT.
> > 
> > "autonomous_on_remote_crash": When a remote processor that has been attached to
> > crashes, it will be detached from if "autonomous_on_remote_crash" is specified
> > in the DT. It is _not_ used in this set and presented to show how I intend to 
> > organise things. 
> 
> 
> > 
> > I spent a fair amount of time coming up with the name for the bindings and would
> > welcome other ideas.  I will write a proper yaml file and CC the linux-kernel
> > mailing list once we have an agreement on the naming convention.
> > 
> > Applies cleanly on v5.10-rc1
> > 
> > Thanks,
> > Mathieu
> > 
> > [1]. https://lkml.org/lkml/2020/7/14/1600
> > 
> > Mathieu Poirier (14):
> >   remoteproc: Re-check state in rproc_shutdown()
> >   remoteproc: Remove useless check in rproc_del()
> >   remoteproc: Add new RPROC_ATTACHED state
> >   remoteproc: Properly represent the attached state
> >   remoteproc: Properly deal with a kernel panic when attached
> >   remoteproc: Add new detach() remoteproc operation
> >   remoteproc: Introduce function __rproc_detach()
> >   remoteproc: Introduce function rproc_detach()
> >   remoteproc: Rename function rproc_actuate()
> >   remoteproc: Add return value to function rproc_shutdown()
> >   remoteproc: Properly deal with a stop request when attached
> >   remoteproc: Properly deal with detach request
> >   remoteproc: Add automation flags
> >   remoteproc: Refactor rproc delete and cdev release path
> > 
> >  drivers/remoteproc/remoteproc_cdev.c  |  24 +++-
> >  drivers/remoteproc/remoteproc_core.c  | 183 +++++++++++++++++++++-----
> >  drivers/remoteproc/remoteproc_sysfs.c |  17 ++-
> >  include/linux/remoteproc.h            |  19 ++-
> >  4 files changed, 199 insertions(+), 44 deletions(-)
> > 
> 
> From a04866cf0add0020b65e9ab80d62d44290a1d695 Mon Sep 17 00:00:00 2001
> From: Arnaud Pouliquen <arnaud.pouliquen@xxxxxx>
> Date: Thu, 12 Nov 2020 18:25:52 +0100
> Subject: [PATCH] remoteproc: core fix unexpected state on crash for attached
>  firmware
> 
> The recovery falls in an unexpected state when attached to a remote
> processor.
> As no firmware to load is found, the remote processor has
> just been stopped but associated resources are not free.
> As consequence rproc->power is remaining at 1 and it s no more
> possible to recover the remote processor.
> This patch shutdown the attached remote processor instead of trying
> to recover it.
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@xxxxxx>
> ---
>  drivers/remoteproc/remoteproc_core.c | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index a0611d494758..a38209dd782c 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1739,6 +1739,8 @@ static void rproc_crash_handler_work(struct work_struct *work)
>  {
>  	struct rproc *rproc = container_of(work, struct rproc, crash_handler);
>  	struct device *dev = &rproc->dev;
> +	unsigned int rproc_attached = false;
> +
> 
>  	dev_dbg(dev, "enter %s\n", __func__);
> 
> @@ -1750,15 +1752,21 @@ static void rproc_crash_handler_work(struct work_struct
> *work)
>  		return;
>  	}
> 
> +	if (rproc->state == RPROC_ATTACHED)
> +		rproc_attached = true;
> +
>  	rproc->state = RPROC_CRASHED;
>  	dev_err(dev, "handling crash #%u in %s\n", ++rproc->crash_cnt,
>  		rproc->name);
> 
>  	mutex_unlock(&rproc->lock);
> 
> -	if (!rproc->recovery_disabled)
> -		rproc_trigger_recovery(rproc);
> -
> +	if (!rproc->recovery_disabled) {
> +		if (!rproc_attached)
> +			rproc_trigger_recovery(rproc);
> +		else
> +			rproc_shutdown(rproc);
> +	}
>  	pm_relax(rproc->dev.parent);
>  }
> 
> @@ -1862,7 +1870,8 @@ int rproc_shutdown(struct rproc *rproc)
>  		return ret;
>  	}
> 
> -	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED &&
> +	    rproc->state != RPROC_CRASHED) {
>  		ret = -EPERM;
>  		goto out;
>  	}
> -- 
> 2.17.1
> 
> 
> 



[Index of Archives]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Photo Sharing]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux