On 2023/2/10 23:28, Guenter Roeck wrote: > On 2/9/23 23:01, Xingyu Wu wrote: >> On 2023/2/2 6:46, Guenter Roeck wrote: >>> On Mon, Dec 19, 2022 at 05:42:32PM +0800, Xingyu Wu wrote: >>>> Add watchdog driver for the StarFive JH7110 SoC. >>>> >>>> Signed-off-by: Xingyu Wu <xingyu.wu@xxxxxxxxxxxxxxxx> >>> > >[...] > >>> >>>> +MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default=" >>>> + __MODULE_STRING(WATCHDOG_NOWAYOUT) ")"); >>>> +MODULE_PARM_DESC(soft_noboot, >>>> + "Watchdog action, set to 1 to ignore reboots, 0 to reboot (default 0)"); >>> >>> I do not understand what this module parameter is supposed to be used for, >>> and what the "soft_' prefix is supposed to mean. >> >> This 'soft_noboot' means watchdog reset enabled status. If 'soft_noboot' is set to 1, >> it means reset is disabled and do not reboot.Or 'reboot_disbled' instead? >> > > That means it does nothing ? Why load the watchdog in the first place then ? > This feature is used for debugging, so user can check the counter repeatedly without rebooting the system. >[...] >>>> + >>>> +/* interrupt status whether has been raised from the counter */ >>>> +static bool starfive_wdt_raise_irq_status(struct starfive_wdt *wdt) >>>> +{ >>>> + return !!readl(wdt->base + wdt->drv_data->irq_is_raise); >>>> +} >>>> + >>>> +static void starfive_wdt_int_clr(struct starfive_wdt *wdt) >>>> +{ >>>> + writel(STARFIVE_WDT_INTCLR, wdt->base + wdt->drv_data->int_clr); >>>> +} >>> >>> There is no explanation for this interrupt handling (or, rather, >>> non-handling since there is no interrupt handler. What is the purpose >>> of even having all this code ? >> >> Because the watchdog raise an interrupt signal on the hardware when timeout, >> although we do not use interrupt handler on the sorfware, but the watchdog >> initialization or reload also need to clear the hardware interrupt signal. >> > > That should be documented. > I will add that in the comments. > >[...] >>>> + >>>> + /* >>>> + * This watchdog takes twice timeouts to reset. >>>> + * In order to reduce time to reset, should set half count value. >>>> + */ >>>> + count = timeout * freq / 2; >>>> + >>>> + if (count > STARFIVE_WDT_MAXCNT) { >>> >>> count is an unsigned int, which is pretty much everywhere a 32-bit >>> value. STARFIVE_WDT_MAXCNT is 0xffffffff. How is an unsigned integer >>> ever supposed to be larger than 0xffffffff ? >>> >>>> + dev_warn(wdt->dev, "timeout %d too big,use the MAX-timeout set.\n", >>>> + timeout); >>>> + timeout = starfive_wdt_max_timeout(wdt); >>>> + count = timeout * freq; >>> >>> This is confusing. First, the timeout range is checked by the calling code, >>> which makes sure it is never larger than max_timeout. max_timeout is >>> set to the value returned by starfive_wdt_max_timeout(). >>> Thus, count = timeout * freq / 2 will _never_ be larger than >>> STARFIVE_WDT_MAXCNT. Even if it ever was, it doesn't make sense >>> to use a count value of timeout * freq in that case, ie a value which >>> could be twice as large as the supposed maximum value. >> >> Change 'count' type to 'u64'. And if 'count' is larger than STARFIVE_WDT_MAXCNT, >> 'count' is equal to STARFIVE_WDT_MAXCNT. Does that seem OK? >> > > That would not change anything. This is not about overflows; I would > have mentioned that. count can still never be larger than STARFIVE_WDT_MAXCNT. > Please do the math. > I get your point. It has checked the 'count' before the ops of 'set_timeout' so I check the 'count' uselessly in this. I will remove this. > >[...] >>>> + >>>> + if (tmr_atboot && started == 0) { >>>> + starfive_wdt_start(&wdt->wdt_device); >>>> + } else if (!tmr_atboot) { >>>> + /* >>>> + *if we're not enabling the watchdog, then ensure it is >>>> + * disabled if it has been left running from the bootloader >>>> + * or other source. >>>> + */ >>>> + starfive_wdt_stop(&wdt->wdt_device); >>> >>> If it _is_ running from the boot loader, the watchdog core is not >>> informed about it. If it is started here, it is not informed either. >>> This is unusual and will need to be explained. >>> Why ? >> >> Is is okay to remove the 'started'? >> > Yes, though of course it would be better if the watchdog is kept running > in that situation and the watchdog core is informed about it. Also, > the watchdog core is still not informed that the watchdog is running > (i.e., WDOG_HW_RUNNING is not set) when it is explicitly started. > Will remove the 'started'. >>> >>>> + } >>>> + >>>> +#ifdef CONFIG_PM >>>> + clk_disable_unprepare(wdt->core_clk); >>>> + clk_disable_unprepare(wdt->apb_clk); >>>> +#endif >>> >>> I do not understand the above code. Why stop the watchdog if CONFIG_PM >>> is enabled, even if it is supposed to be running ? >> >> There misses a check about 'early_enable' and add 'if (!early_enable)'. >> There means that disable clock when watchdog sleep and CONFIG_PM is enable. >> Then enable clock when watchdog work by 'starfive_wdt_runtime_resume' function. >> > > I am quite sure that you are supposed to use pm functions for that purpose, > such as pm_runtime_get_sync(), pm_runtime_put_sync(), and pm_runtime_enable(), > similar to the code in omap_wdt.c. > I will use pm_runtime_get_sync() and pm_runtime_put_sync() to operate clocks. >[...] >>>> +#ifdef CONFIG_PM_SLEEP >>>> +static int starfive_wdt_suspend(struct device *dev) >>>> +{ >>>> + int ret; >>>> + struct starfive_wdt *wdt = dev_get_drvdata(dev); >>>> + >>>> + starfive_wdt_unlock(wdt); >>>> + >>>> + /* Save watchdog state, and turn it off. */ >>>> + wdt->reload = starfive_wdt_get_count(wdt); >>>> + >>>> + starfive_wdt_mask_and_disable_reset(wdt, true); >>>> + >>>> + /* Note that WTCNT doesn't need to be saved. */ >>>> + starfive_wdt_stop(&wdt->wdt_device); >>>> + pm_runtime_force_suspend(dev); >>>> + >>>> + starfive_wdt_lock(wdt); >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static int starfive_wdt_resume(struct device *dev) >>>> +{ >>>> + int ret; >>>> + struct starfive_wdt *wdt = dev_get_drvdata(dev); >>>> + >>>> + starfive_wdt_unlock(wdt); >>>> + >>>> + pm_runtime_force_resume(dev); >>>> + >>>> + /* Restore watchdog state. */ >>>> + starfive_wdt_set_reload_count(wdt, wdt->reload); >>>> + >>>> + starfive_wdt_restart(&wdt->wdt_device, 0, NULL); >>> >>> I do not understand this call. Per its definition it is a restart handler, >>> supposed to restart the hardware. Why would anyone want to restart the >>> system on resume ? >> >> The watchdog start counting from 'count' to 0 on everytime resume like a restart. >> So I directly use a restart. >> > > That doesn't answer my question. The "restart" callback resets the hardware. > starfive_wdt_restart() is registered as restart handler, and thus expected > to reset the hardware. It it doesn't reset the hardware, it should not > register itself as restart handler. If it does restart the hardware, calling > it on resume would automatically reset the system on each resume. > Something is wrong here, and will have to be fixed. > > I _suspect_ that you think that the restart callback is supposed to reset > the watchdog. That would be wrong. It resets (restarts) the hardware, > not the watchdog. Please read Documentation/watchdog/watchdog-kernel-api.rst > if there are questions about this callback. > Thanks you for reminding me. I finally understand that the restart callback is supposed to reset the hardware not watchdog. This watchdog doesn't reset the hardware, and I will remove that. By the way, if I don't need restart callback, would I still use the 'watchdog_set_restart_priority' function? Thanks. Best regards, Xingyu Wu