Hi Mathieu, On 3/30/20 6:49 PM, Mathieu Poirier wrote: > On Fri, Mar 27, 2020 at 02:04:36PM +0000, Loic PALLARDY wrote: >> >> >>> -----Original Message----- >>> From: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> >>> Sent: mardi 24 mars 2020 22:46 >>> To: bjorn.andersson@xxxxxxxxxx >>> Cc: ohad@xxxxxxxxxx; Loic PALLARDY <loic.pallardy@xxxxxx>; s- >>> anna@xxxxxx; peng.fan@xxxxxxx; Arnaud POULIQUEN >>> <arnaud.pouliquen@xxxxxx>; Fabien DESSENNE >>> <fabien.dessenne@xxxxxx>; linux-remoteproc@xxxxxxxxxxxxxxx >>> Subject: [PATCH v2 16/17] remoteproc: Correctly deal with MCU >>> synchronisation when changing state >>> >>> This patch deals with state changes when synchronising with an MCU. More >>> specifically it prevents the MCU from being started if it already has been >>> started by another entity. Similarly it prevents the AP from stopping the >>> MCU if it hasn't been given the capability by platform firmware. >>> >>> Signed-off-by: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> >>> --- >>> drivers/remoteproc/remoteproc_sysfs.c | 32 >>> ++++++++++++++++++++++++++- >>> 1 file changed, 31 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/remoteproc/remoteproc_sysfs.c >>> b/drivers/remoteproc/remoteproc_sysfs.c >>> index 4956577ad4b4..741a3c152b82 100644 >>> --- a/drivers/remoteproc/remoteproc_sysfs.c >>> +++ b/drivers/remoteproc/remoteproc_sysfs.c >>> @@ -108,6 +108,29 @@ static ssize_t state_show(struct device *dev, struct >>> device_attribute *attr, >>> return sprintf(buf, "%s\n", rproc_state_string[state]); >>> } >>> >>> +static int rproc_can_shutdown(struct rproc *rproc) >>> +{ >>> + /* The MCU is not running, obviously an invalid operation. */ >>> + if (rproc->state != RPROC_RUNNING) >>> + return false; >>> + >>> + /* >>> + * The MCU is not running (see above) and the remoteproc core is >>> the >>> + * lifecycle manager, no problem calling for a shutdown. >>> + */ >>> + if (!rproc_sync_with_mcu(rproc)) >>> + return true; >>> + >>> + /* >>> + * The MCU has been loaded by another entity (see above) and the >>> + * platform code has _not_ given us the capability of stopping it. >>> + */ >>> + if (!rproc->sync_ops->stop) >>> + return false; >> >> Test could be simplified >> if (rproc_sync_with_mcu(rproc)) && !rproc->sync_ops->stop) >> return false; > > I laid out the test individually on purpose. That way there is no coupling > between conditions, it is plane to see what is going on and remains maintainable > as we add new tests. > >> >>> + >>> + return true; >>> +} >>> + >>> /* Change remote processor state via sysfs */ >>> static ssize_t state_store(struct device *dev, >>> struct device_attribute *attr, >>> @@ -120,11 +143,18 @@ static ssize_t state_store(struct device *dev, >>> if (rproc->state == RPROC_RUNNING) >>> return -EBUSY; >>> >>> + /* >>> + * In synchronisation mode, booting the MCU is the >>> + * responsibility of an external entity. >>> + */ >>> + if (rproc_sync_with_mcu(rproc)) >>> + return -EINVAL; >>> + >> I don't understand this restriction, simply because it is preventing to resynchronize with a >> coprocessor after a "stop". There's actually one more scenario even without "stop". If auto_boot is set to false, then rproc_actuate will never get called. >> In the following configuration which can be configuration for coprocessor with romed/flashed >> firmware (no reload needed): >> on_init = true >> after_stop = true >> after_crash = true >> Once you stop it via sysfs interface, you can't anymore restart/resync to it. > > Very true. The MCU will get restarted by another entity but the AP won't > synchronise with it. I need more time to think about the best way to deal with > this and may have to get back to you for further discussions. > >> >> I think it will be better to modify rproc_boot() to take into account rproc_sync_with_mcu() >> as below: >> >> int rproc_boot(struct rproc *rproc) >> { >> - const struct firmware *firmware_p; >> + const struct firmware *firmware_p = NULL; >> struct device *dev = &rproc->dev; >> int ret; >> >> if (!rproc) { >> pr_err("invalid rproc handle\n"); >> return -EINVAL; >> } >> >> /* load firmware */ >> - ret = request_firmware(&firmware_p, rproc->firmware, dev); >> - if (ret < 0) { >> - dev_err(dev, "request_firmware failed: %d\n", ret); >> - return ret; >> + if (!rproc_sync_with_mcu(rproc)) { I guess this is what the original skip_fw_load was doing. And with the current series, the userspace loading support usecase I have cannot be achieved. If this is added back, I can try if that works for my usecases. >> + ret = request_firmware(&firmware_p, rproc->firmware, dev); >> + if (ret < 0) { >> + dev_err(dev, "request_firmware failed: %d\n", ret); >> + return ret; >> + } >> } >> >> ret = rproc_actuate(rproc, firmware_p); >> >> - release_firmware(firmware_p); >> + if (firmware_p) >> + release_firmware(firmware_p); >> >> return ret; >> } >> >> Thanks to these modifications, I'm able to resync after a stop with coprocessor without reloading firmware. >> >>> ret = rproc_boot(rproc); >>> if (ret) >>> dev_err(&rproc->dev, "Boot failed: %d\n", ret); >>> } else if (sysfs_streq(buf, "stop")) { >>> - if (rproc->state != RPROC_RUNNING) >>> + if (!rproc_can_shutdown(rproc)) >>> return -EINVAL; >>> >>> rproc_shutdown(rproc); >> As rproc shutdown is also accessible as kernel API, I propose to move >> rproc_can_shutdown() check inside rproc_shutdown() and to test >> returned error > > Ah yes, it is public... As you point out, I think we'll need to move > rproc_can_shutdown() in rproc_shutdown(). I am assuming only the new conditions, right? regards Suman > > Thank you for taking the time to review this set, > Mathieu >