On 4/2/20 3:42 PM, Mathieu Poirier wrote: > Hi Suman, > > On Tue, Mar 31, 2020 at 05:35:58PM -0500, Suman Anna wrote: >> Hi Mathieu, >> >> On 3/30/20 6:49 PM, Mathieu Poirier wrote: >>> On Fri, Mar 27, 2020 at 02:04:36PM +0000, Loic PALLARDY wrote: >>>> >>>> >>>>> -----Original Message----- >>>>> From: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> >>>>> Sent: mardi 24 mars 2020 22:46 >>>>> To: bjorn.andersson@xxxxxxxxxx >>>>> Cc: ohad@xxxxxxxxxx; Loic PALLARDY <loic.pallardy@xxxxxx>; s- >>>>> anna@xxxxxx; peng.fan@xxxxxxx; Arnaud POULIQUEN >>>>> <arnaud.pouliquen@xxxxxx>; Fabien DESSENNE >>>>> <fabien.dessenne@xxxxxx>; linux-remoteproc@xxxxxxxxxxxxxxx >>>>> Subject: [PATCH v2 16/17] remoteproc: Correctly deal with MCU >>>>> synchronisation when changing state >>>>> >>>>> This patch deals with state changes when synchronising with an MCU. More >>>>> specifically it prevents the MCU from being started if it already has been >>>>> started by another entity. Similarly it prevents the AP from stopping the >>>>> MCU if it hasn't been given the capability by platform firmware. >>>>> >>>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> >>>>> --- >>>>> drivers/remoteproc/remoteproc_sysfs.c | 32 >>>>> ++++++++++++++++++++++++++- >>>>> 1 file changed, 31 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/remoteproc/remoteproc_sysfs.c >>>>> b/drivers/remoteproc/remoteproc_sysfs.c >>>>> index 4956577ad4b4..741a3c152b82 100644 >>>>> --- a/drivers/remoteproc/remoteproc_sysfs.c >>>>> +++ b/drivers/remoteproc/remoteproc_sysfs.c >>>>> @@ -108,6 +108,29 @@ static ssize_t state_show(struct device *dev, struct >>>>> device_attribute *attr, >>>>> return sprintf(buf, "%s\n", rproc_state_string[state]); >>>>> } >>>>> >>>>> +static int rproc_can_shutdown(struct rproc *rproc) >>>>> +{ >>>>> + /* The MCU is not running, obviously an invalid operation. */ >>>>> + if (rproc->state != RPROC_RUNNING) >>>>> + return false; >>>>> + >>>>> + /* >>>>> + * The MCU is not running (see above) and the remoteproc core is >>>>> the >>>>> + * lifecycle manager, no problem calling for a shutdown. >>>>> + */ >>>>> + if (!rproc_sync_with_mcu(rproc)) >>>>> + return true; >>>>> + >>>>> + /* >>>>> + * The MCU has been loaded by another entity (see above) and the >>>>> + * platform code has _not_ given us the capability of stopping it. >>>>> + */ >>>>> + if (!rproc->sync_ops->stop) >>>>> + return false; >>>> >>>> Test could be simplified >>>> if (rproc_sync_with_mcu(rproc)) && !rproc->sync_ops->stop) >>>> return false; >>> >>> I laid out the test individually on purpose. That way there is no coupling >>> between conditions, it is plane to see what is going on and remains maintainable >>> as we add new tests. >>> >>>> >>>>> + >>>>> + return true; >>>>> +} >>>>> + >>>>> /* Change remote processor state via sysfs */ >>>>> static ssize_t state_store(struct device *dev, >>>>> struct device_attribute *attr, >>>>> @@ -120,11 +143,18 @@ static ssize_t state_store(struct device *dev, >>>>> if (rproc->state == RPROC_RUNNING) >>>>> return -EBUSY; >>>>> >>>>> + /* >>>>> + * In synchronisation mode, booting the MCU is the >>>>> + * responsibility of an external entity. >>>>> + */ >>>>> + if (rproc_sync_with_mcu(rproc)) >>>>> + return -EINVAL; >>>>> + >>>> I don't understand this restriction, simply because it is preventing to resynchronize with a >>>> coprocessor after a "stop". >> >> There's actually one more scenario even without "stop". If auto_boot is >> set to false, then rproc_actuate will never get called. >> >>>> In the following configuration which can be configuration for coprocessor with romed/flashed >>>> firmware (no reload needed): >>>> on_init = true >>>> after_stop = true >>>> after_crash = true >>>> Once you stop it via sysfs interface, you can't anymore restart/resync to it. >>> >>> Very true. The MCU will get restarted by another entity but the AP won't >>> synchronise with it. I need more time to think about the best way to deal with >>> this and may have to get back to you for further discussions. >>> >>>> >>>> I think it will be better to modify rproc_boot() to take into account rproc_sync_with_mcu() >>>> as below: >>>> >>>> int rproc_boot(struct rproc *rproc) >>>> { >>>> - const struct firmware *firmware_p; >>>> + const struct firmware *firmware_p = NULL; >>>> struct device *dev = &rproc->dev; >>>> int ret; >>>> >>>> if (!rproc) { >>>> pr_err("invalid rproc handle\n"); >>>> return -EINVAL; >>>> } >>>> >>>> /* load firmware */ >>>> - ret = request_firmware(&firmware_p, rproc->firmware, dev); >>>> - if (ret < 0) { >>>> - dev_err(dev, "request_firmware failed: %d\n", ret); >>>> - return ret; >>>> + if (!rproc_sync_with_mcu(rproc)) { >> >> I guess this is what the original skip_fw_load was doing. And with the >> current series, the userspace loading support usecase I have cannot be >> achieved. If this is added back, I can try if that works for my usecases. > > I didn't notice this comment upon first read... Can you give me more details on > what your usecase is order to see how best to deal with it? We have a userspace loader usecase, where we use a daemon/server that does the loading, and talks to the keystone remoteproc driver over a character driver to publish the resource table and boot vectors, and to start/stop the processor. You can find more details on the downstream commit [1] we have. We exercise the regular kernel rproc state-machine and suppress the firmware segment loading portion of it. regards Suman [1] https://git.ti.com/gitweb?p=rpmsg/remoteproc.git;a=commit;h=c41f591ccaaeef66bd4ccbb883ae72f6b0d173d7 > > Thanks, > Mathieu > >> >>>> + ret = request_firmware(&firmware_p, rproc->firmware, dev); >>>> + if (ret < 0) { >>>> + dev_err(dev, "request_firmware failed: %d\n", ret); >>>> + return ret; >>>> + } >>>> } >>>> >>>> ret = rproc_actuate(rproc, firmware_p); >>>> >>>> - release_firmware(firmware_p); >>>> + if (firmware_p) >>>> + release_firmware(firmware_p); >>>> >>>> return ret; >>>> } >>>> >>>> Thanks to these modifications, I'm able to resync after a stop with coprocessor without reloading firmware. >>>> >>>>> ret = rproc_boot(rproc); >>>>> if (ret) >>>>> dev_err(&rproc->dev, "Boot failed: %d\n", ret); >>>>> } else if (sysfs_streq(buf, "stop")) { >>>>> - if (rproc->state != RPROC_RUNNING) >>>>> + if (!rproc_can_shutdown(rproc)) >>>>> return -EINVAL; >>>>> >>>>> rproc_shutdown(rproc); >>>> As rproc shutdown is also accessible as kernel API, I propose to move >>>> rproc_can_shutdown() check inside rproc_shutdown() and to test >>>> returned error >>> >>> Ah yes, it is public... As you point out, I think we'll need to move >>> rproc_can_shutdown() in rproc_shutdown(). >> >> I am assuming only the new conditions, right? >> >> regards >> Suman >> >>> >>> Thank you for taking the time to review this set, >>> Mathieu >>>