Hi,

On 30.03.2022 10:47, Maxime Ripard wrote:
> On Wed, Mar 30, 2022 at 10:06:13AM +0200, Marek Szyprowski wrote:
>> On 25.03.2022 17:11, Maxime Ripard wrote:
>>> While the current code will trigger a new clk_set_rate call whenever the
>>> rate boundaries are changed through clk_set_rate_range, this doesn't
>>> occur when clk_put() is called.
>>>
>>> However, this is essentially equivalent since, after clk_put()
>>> completes, those boundaries won't be enforced anymore.
>>>
>>> Let's add a call to clk_set_rate_range in clk_put to make sure those
>>> rate boundaries are dropped and the clock drivers can react.
>>>
>>> Let's also add a few tests to make sure this case is covered.
>>>
>>> Fixes: c80ac50cbb37 ("clk: Always set the rate on clk_set_range_rate")
>>> Signed-off-by: Maxime Ripard <maxime@xxxxxxxxxx>
>> This patch landed recently in linux-next 20220328 as commit 7dabfa2bc480
>> ("clk: Drop the rate range on clk_put()"). Sadly it breaks booting of
>> the few of my test systems: Samsung ARM 32bit Exynos3250 based Rinato
>> board and all Amlogic Meson G12B/SM1 based boards (Odroid C4, N2, Khadas
>> VIM3/VIM3l). Rinato hangs always with the following oops:
>>
>> --->8---
>>
>> Kernel panic - not syncing: MCT hangs after writing 4 (offset:0x420)
>> CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.17.0-rc1-00014-g7dabfa2bc480
>> #11551
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>>   unwind_backtrace from show_stack+0x10/0x14
>>   show_stack from dump_stack_lvl+0x58/0x70
>>   dump_stack_lvl from panic+0x10c/0x328
>>   panic from exynos4_mct_tick_stop+0x0/0x2c
>> ---[ end Kernel panic - not syncing: MCT hangs after writing 4
>> (offset:0x420) ]---
>>
>> --->8---
>>
>> Amlogic boards hang randomly during early userspace init, usually just
>> after loading the driver modules.
>>
>> Reverting $subject on top of linux-next fixes all those problems.
>>
>> I will try to analyze it a bit more and if possible provide some more
>> useful/meaning full logs later.
>
> I'm not sure what could go wrong there, but if you can figure out the
> clock, if it tries to set a new rate and what rate it is, it would be
> awesome :)

So far I've noticed that the problem is caused by setting the rate of
some clocks to zero. The following patch fixes my issues:

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 32a9eaf35c6b..39cab08dbecb 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -2201,6 +2201,9 @@ static int clk_core_set_rate_nolock(struct clk_core *core,
        if (!core)
                return 0;

+       if (req_rate == 0)
+               return 0;
+
        rate = clk_core_req_round_rate_nolock(core, req_rate);

        /* bail early if nothing to do */
--

I will soon grab the call stack and the relevant clock topology to show
how the rate is set to zero.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland