On 9/6/2024 3:10 PM, Beleswar Padhi wrote:
The current implementation of the waiting mechanism in probe() waits for
the 'released_from_reset' flag to be set which is done in
k3_r5_rproc_prepare() as part of rproc_fw_boot(). This causes unexpected
failures in cases where the firmware is unavailable at boot time,
resulting in probe failure and removal of the remoteproc handles in the
sysfs paths.
I won't say failure, I will say this is behavior of driver.
Driver expect firmware to be available , but I agree driver should be
able to execute
with/without firmware availability.
To address this, the waiting mechanism is refactored out of the probe
routine into the appropriate k3_r5_rproc_prepare/unprepare() and
k3_r5_rproc_start/stop() functions. This allows the probe routine to
complete without depending on firmware booting, while still maintaining
the required power-synchronization between cores.
Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1")
Signed-off-by: Beleswar Padhi <b-padhi@xxxxxx>
---
Posted this as a Fix as this was breaking usecases where we wanted to load a
firmware by writing to sysfs handles in userspace.
drivers/remoteproc/ti_k3_r5_remoteproc.c | 170 ++++++++++++++++-------
1 file changed, 118 insertions(+), 52 deletions(-)
diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c b/drivers/remoteproc/ti_k3_r5_remoteproc.c
[..]
+ core0 = list_first_entry(&cluster->cores, struct k3_r5_core, elem);
+ core1 = list_last_entry(&cluster->cores, struct k3_r5_core, elem);
+ if (cluster->mode == CLUSTER_MODE_SPLIT && core == core1 &&
+ core0->released_from_reset == false) {
+ ret = wait_event_interruptible_timeout(cluster->core_transition,
+ core0->released_from_reset,
+ msecs_to_jiffies(2000));
only one wait in start should be good enough,
+ if (ret <= 0) {
+ dev_err(dev, "can not power up core1 before core0");
+ return -EPERM;
+ }
+ }
+
ret = ti_sci_proc_get_status(core->tsp, &boot_vec, &cfg, &ctrl, &stat);
if (ret < 0)
return ret;
@@ -470,6 +492,12 @@ static int k3_r5_rproc_prepare(struct rproc *rproc)
return ret;
}
+ /* Notify all threads in the wait queue when core state has changed so
+ * that threads waiting for this condition can be executed.
+ */
+ core->released_from_reset = true;
+ wake_up_interruptible(&cluster->core_transition);
+
/*
* Newer IP revisions like on J7200 SoCs support h/w auto-initialization
* of TCMs, so there is no need to perform the s/w memzero. This bit is
@@ -515,14 +543,46 @@ static int k3_r5_rproc_unprepare(struct rproc *rproc)
{
struct k3_r5_rproc *kproc = rproc->priv;
struct k3_r5_cluster *cluster = kproc->cluster;
- struct k3_r5_core *core = kproc->core;
+ struct k3_r5_core *core0, *core1, *core = kproc->core;
why you need wait in unprepare/stop,
In case you are failing during firmware load or so then already we are
in bad state.
if this is call from user land then, i don't except anyone will auto
trigger stopping of core-0
IMO, in unprepare/stop, if action is attempted on core1 with core-0 ON,
simply return error.
[..]
@@ -629,7 +702,7 @@ static int k3_r5_rproc_stop(struct rproc *rproc)
struct k3_r5_rproc *kproc = rproc->priv;
struct k3_r5_cluster *cluster = kproc->cluster;
struct device *dev = kproc->dev;