Hi, Ulf,

On 29.04.2024 17:19, Ulf Hansson wrote:
> On Wed, 24 Apr 2024 at 13:14, claudiu beznea <claudiu.beznea@xxxxxxxxx> wrote:
>>
>> Hi, Ulf,
>>
>> On 12.04.2024 17:02, claudiu beznea wrote:
>>> Hi, Ulf,
>>>
>>> On 12.04.2024 14:14, Ulf Hansson wrote:
>>>> On Wed, 10 Apr 2024 at 16:19, Claudiu <claudiu.beznea@xxxxxxxxx> wrote:
>>>>>
>>>>> From: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>
>>>>>
>>>>> The rzg2l_wdt_restart() is called from atomic context. Calling
>>>>> pm_runtime_{get_sync, resume_and_get}() or any other runtime PM resume
>>>>> APIs is not an option as it may lead to issues as described in commit
>>>>> e4cf89596c1f ("watchdog: rzg2l_wdt: Fix 'BUG: Invalid wait context'")
>>>>> that removed the pm_runtime_get_sync() and used directly the
>>>>> clk_prepare_enable() APIs.
>>>>>
>>>>> Starting with RZ/G3S the watchdog could be part of its own software
>>>>> controlled power domain (see the initial implementation in Link section).
>>>>> In case the watchdog is not used the power domain is off and accessing
>>>>> watchdog registers leads to aborts.
>>>>>
>>>>> To solve this the patch powers on the power domain using
>>>>> dev_pm_genpd_resume() API before enabling its clock. This is not
>>>>> sleeping or taking any other locks as the power domain will not be
>>>>> registered with GENPD_FLAG_IRQ_SAFE flags.
>>>>>
>>>>> Link: https://lore.kernel.org/all/20240208124300.2740313-1-claudiu.beznea.uj@xxxxxxxxxxxxxx
>>>>> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@xxxxxxxxxxxxxx>
>>>>> ---
>>>>>
>>>>> Changes in v8:
>>>>> - none, this patch is new
>>>>>
>>>>>  drivers/watchdog/rzg2l_wdt.c | 12 ++++++++++++
>>>>>  1 file changed, 12 insertions(+)
>>>>>
>>>>> diff --git a/drivers/watchdog/rzg2l_wdt.c b/drivers/watchdog/rzg2l_wdt.c
>>>>> index c8c20cfb97a3..98e5e9914a5d 100644
>>>>> --- a/drivers/watchdog/rzg2l_wdt.c
>>>>> +++ b/drivers/watchdog/rzg2l_wdt.c
>>>>> @@ -12,6 +12,7 @@
>>>>>  #include <linux/module.h>
>>>>>  #include <linux/of.h>
>>>>>  #include <linux/platform_device.h>
>>>>> +#include <linux/pm_domain.h>
>>>>>  #include <linux/pm_runtime.h>
>>>>>  #include <linux/reset.h>
>>>>>  #include <linux/units.h>
>>>>> @@ -164,6 +165,17 @@ static int rzg2l_wdt_restart(struct watchdog_device *wdev,
>>>>>  	struct rzg2l_wdt_priv *priv = watchdog_get_drvdata(wdev);
>>>>>  	int ret;
>>>>>
>>>>> +	/*
>>>>> +	 * The device may be part of a power domain that is currently
>>>>> +	 * powered off. We need to power it up before accessing registers.
>>>>> +	 * We don't undo the dev_pm_genpd_resume() as the device need to
>>>>> +	 * be up for the reboot to happen. Also, as we are in atomic context
>>>>> +	 * here there is no need to increment PM runtime usage counter
>>>>> +	 * (to make sure pm_runtime_active() doesn't return wrong code).
>>>>> +	 */
>>>>> +	if (!pm_runtime_active(wdev->parent))
>>>>> +		dev_pm_genpd_resume(wdev->parent);
>>>>> +
>>>>
>>>> I doubt this is the correct solution, but I may be wrong. Unless this
>>>> is invoked at the syscore stage?
>>>
>>> On my case I see it invoked from kernel_restart(). As of my code reading,
>>
>> With the above explanations, do you consider calling dev_pm_genpd_resume()
>> here is still wrong?
>
> Yes. At least, those genpd functions were not added to cope for cases like this.

Sorry to bother you, but do you have any suggestions on this topic?
On my side, I did some investigation to see how else it could be
implemented, but I don't have a clear idea of how to go forward.

Would you prefer to have a separate API to deal with domain power on in
this scenario? Maybe one that should run only in the reboot context?

Would you consider only updating the descriptions of dev_pm_genpd_resume()
and genpd_sync_power_on() to specify that they could run in a reboot
context?

Would you consider updating genpd_switch_state() to take the system reboot
state into account and do the locking based on that, too?

>
> Moreover, you still need to find another solution as
> clk_prepare_enable() can't be called in this path.

The clock driver does not implement clk_ops::prepare on any of the
micro-architectures where this watchdog driver is used. This may be the
reason clk_prepare_enable() was used on this path from the beginning.
Even so, a simple solution I have in mind for this is to keep the clk
prepared all the time.

Thank you,
Claudiu Beznea

>
>>
>> Do you have any suggestions I could try?
>
> Not at the moment, but I will try to circle back to this topic more
> thinking next week, when I have some more time.
>
>>
>> Thank you,
>> Claudiu Beznea
>
> Kind regards
> Uffe
>
>>
>>> at that point only one CPU is active with IRQs disabled (done in
>>> machine_restart()). Below is the stack trace decoded on next-20240410 with
>>> this series
>>> (https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/)
>>> on top and the one from here (adding power domain support):
>>> https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/
>>>
>>> Hardware name: Renesas SMARC EVK version 2 based on r9a08g045s33 (DT)
>>> Call trace:
>>> dump_backtrace (arch/arm64/kernel/stacktrace.c:319)
>>> show_stack (arch/arm64/kernel/stacktrace.c:326)
>>> dump_stack_lvl (lib/dump_stack.c:117)
>>> dump_stack (lib/dump_stack.c:124)
>>> rzg2l_wdt_restart (drivers/watchdog/rzg2l_wdt.c:180)
>>> watchdog_restart_notifier (drivers/watchdog/watchdog_core.c:188)
>>> atomic_notifier_call_chain (kernel/notifier.c:98 kernel/notifier.c:231)
>>> do_kernel_restart (kernel/reboot.c:236)
>>> machine_restart (arch/arm64/kernel/process.c:145)
>>> kernel_restart (kernel/reboot.c:287)
>>> __do_sys_reboot (kernel/reboot.c:755)
>>> __arm64_sys_reboot (kernel/reboot.c:715)
>>> invoke_syscall (arch/arm64/include/asm/current.h:19
>>> arch/arm64/kernel/syscall.c:53)
>>> el0_svc_common.constprop.0 (include/linux/thread_info.h:127
>>> arch/arm64/kernel/syscall.c:141)
>>> do_el0_svc (arch/arm64/kernel/syscall.c:153)
>>> el0_svc (arch/arm64/include/asm/irqflags.h:56
>>> arch/arm64/include/asm/irqflags.h:77 arch/arm64/kernel/entry-common.c:165
>>> arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
>>> el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731)
>>> el0t_64_sync (arch/arm64/kernel/entry.S:598)
>>>
>>> The watchdog restart handler is added in restart_handler_list and this list
>>> is invoked though do_kernel_restart(). As of my code investigation the
>>> restart_handler_list is invoked only though do_kernel_restart() and only
>>> though the stack trace above.
>>>
>>> Thank you,
>>> Claudiu Beznea
>>>
>>>>
>>>>>  	clk_prepare_enable(priv->pclk);
>>>>>  	clk_prepare_enable(priv->osc_clk);
>>>>>
>>>>> --
>>>>> 2.39.2
>>>>>
>>>>>
>>>>
>>>> Can you redirectly me to the complete series, so I can have a better
>>>> overview of the problem?
>>>
>>> This is the series that adds power domain support for RZ/G3S SoC:
>>> https://lore.kernel.org/all/20240410122657.2051132-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/
>>>
>>> This is the series that adds watchdog support for RZ/G3S SoC:
>>> https://lore.kernel.org/all/20240410134044.2138310-1-claudiu.beznea.uj@xxxxxxxxxxxxxx/
>>>
>>> Thank you for your review,
>>> Claudiu Beznea
>>>
>>>>
>>>> Kind regards
>>>> Uffe
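
For reference, the "keep the clk prepared all the time" idea from the
reply above could look roughly like the user-space model below. It is
only a sketch under stated assumptions: struct model_clk and every
model_* function are hypothetical stand-ins invented for illustration,
not the kernel clk API and not the rzg2l_wdt driver code. The model
assumes that the prepare step may sleep (so it is refused in atomic
context) and that the enable step is atomic-safe but needs an already
prepared clock.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a clock: only the two counters matter here. */
struct model_clk {
	int prepare_count;	/* raised by the sleepable prepare step */
	int enable_count;	/* raised by the atomic-safe enable step */
};

/* Models the prepare step: it may sleep, so refuse it in atomic context. */
static int model_clk_prepare(struct model_clk *clk, bool atomic)
{
	if (atomic)
		return -1;
	clk->prepare_count++;
	return 0;
}

/* Models the enable step: atomic-safe, valid only on a prepared clock. */
static int model_clk_enable(struct model_clk *clk)
{
	if (clk->prepare_count == 0)
		return -1;
	clk->enable_count++;
	return 0;
}

/* Probe runs in process context: prepare once, leave the clocks prepared. */
static int model_wdt_probe(struct model_clk *pclk, struct model_clk *osc_clk)
{
	if (model_clk_prepare(pclk, false))
		return -1;
	return model_clk_prepare(osc_clk, false);
}

/* Restart runs in atomic context: only the enable step is left to do. */
static int model_wdt_restart(struct model_clk *pclk, struct model_clk *osc_clk)
{
	if (model_clk_enable(pclk))
		return -1;
	return model_clk_enable(osc_clk);
}
```

With this split, the restart path never reaches the sleepable step. The
cost is that both clocks stay prepared for the whole lifetime of the
device, whether or not a restart ever happens.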