On Wed, Dec 20, 2023 at 8:22 AM Tudor Ambarus <tudor.ambarus@xxxxxxxxxx> wrote:
>
> Hi, Sam!
>
> On 12/19/23 17:31, Sam Protsenko wrote:
> > On Tue, Dec 19, 2023 at 10:47 AM Tudor Ambarus <tudor.ambarus@xxxxxxxxxx> wrote:
> >>
> >> Hi, Sam!
> >>
> >> On 12/14/23 16:43, Sam Protsenko wrote:
> >>> On Thu, Dec 14, 2023 at 10:15 AM Tudor Ambarus <tudor.ambarus@xxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 12/14/23 16:09, Sam Protsenko wrote:
> >>>>> On Thu, Dec 14, 2023 at 10:01 AM Tudor Ambarus <tudor.ambarus@xxxxxxxxxx> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 12/14/23 15:37, Sam Protsenko wrote:
> >>>>>>> On Thu, Dec 14, 2023 at 4:52 AM Tudor Ambarus <tudor.ambarus@xxxxxxxxxx> wrote:
> >>>>>>>>
> >>>>>>>> Testing USI8 I2C with an eeprom revealed that when the USI8 leaf clock
> >>>>>>>> is disabled it leads to the CMU_TOP PERIC0 IP gate clock disablement,
> >>>>>>>> which then makes the system hang. To prevent this, mark
> >>>>>>>> CLK_GOUT_CMU_PERIC0_IP as critical. Other clocks will be marked
> >>>>>>>> accordingly when tested.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Tudor Ambarus <tudor.ambarus@xxxxxxxxxx>
> >>>>>>>> ---
> >>>>>>>> drivers/clk/samsung/clk-gs101.c | 2 +-
> >>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>> diff --git a/drivers/clk/samsung/clk-gs101.c b/drivers/clk/samsung/clk-gs101.c
> >>>>>>>> index 3d194520b05e..08d80fca9cd6 100644
> >>>>>>>> --- a/drivers/clk/samsung/clk-gs101.c
> >>>>>>>> +++ b/drivers/clk/samsung/clk-gs101.c
> >>>>>>>> @@ -1402,7 +1402,7 @@ static const struct samsung_gate_clock cmu_top_gate_clks[] __initconst = {
> >>>>>>>> "mout_cmu_peric0_bus", CLK_CON_GAT_GATE_CLKCMU_PERIC0_BUS,
> >>>>>>>> 21, 0, 0),
> >>>>>>>> GATE(CLK_GOUT_CMU_PERIC0_IP, "gout_cmu_peric0_ip", "mout_cmu_peric0_ip",
> >>>>>>>> - CLK_CON_GAT_GATE_CLKCMU_PERIC0_IP, 21, 0, 0),
> >>>>>>>> + CLK_CON_GAT_GATE_CLKCMU_PERIC0_IP, 21, CLK_IS_CRITICAL, 0),
> >>>>>>>
> >>>>>>> This clock doesn't seem like a leaf clock.
> >>>>>>> It's also not a bus clock.
> >>>>>>> Leaving it always running makes the whole PERIC0 CMU clocked, which
> >>>>>>> usually should be avoided. Is it possible that the system freezes
> >>>>>>> because some other clock (which depends on peric0_ip) gets disabled as
> >>>>>>> a consequence of disabling peric0_ip? Maybe it's some leaf clock which
> >>>>>>> is not implemented yet in the clock driver? Just looks weird to me
> >>>>>>> that the system hangs because of CMU IP clock disablement. It's
> >>>>>>> usually something much more specific.
> >>>>>>
> >>>>>> The system hang happened when I tested USI8 in I2C configuration with an
> >>>>>> eeprom. After the eeprom is read the leaf gate clock that gets disabled
> >>>>>> is the one on PERIC0 (CLK_GOUT_PERIC0_CLK_PERIC0_USI8_USI_CLK). I assume
> >>>>>> this leads to the CMU_TOP gate (CLK_CON_GAT_GATE_CLKCMU_PERIC0_IP)
> >>>>>> disablement which makes the system hang. Either marking the CMU_TOP gate
> >>>>>> clock as critical (as I did in this patch) or marking the leaf PERIC0
> >>>>>> gate clock as critical, gets rid of the system hang. Did I choose wrong?
> >>>>>>
> >>>>>
> >>>>> Did you already implement 100% of clocks in CMU_PERIC0? If no, there
> >>>>
> >>>> yes.
> >>
> >> I checked again all the clocks. I implemented all but one, the one
> >> defined by the CLK_CON_BUF_CLKBUF_PERIC0_IP register. Unfortunately I
> >> don't have any reference on how it should be defined so I won't touch it
> >> yet. But I have some good news too, see below.
> >>
> >>>
> >>> Ok. Are there any other CMUs (perhaps not implemented yet) which
> >>> consume clocks from CMU_PERIC0, specifically PERIC0_IP clock or some
> >>> clocks derived from it? If so, is there a chance some particular leaf
> >>> clock in those CMUs actually renders the system frozen when disabled
> >>> as a consequence of disabling PERIC0_IP, and would explain better why
> >>> the freeze happens?
> >>>
> >>> For now I think it's ok to have that CLK_IS_CRITICAL flag here,
> >>> because as you said you implemented all clocks in this CMU and none
> >>> of those looks like a critical one. But I'd advise adding a TODO
> >>> comment saying it's probably a temporary solution before actual leaf
> >>> clock which leads to freeze is identified (which probably resides in
> >>> some other not implemented yet CMU).
> >>>
> >>>>
> >>>>> is a chance some other leaf clock (which is not implemented yet in
> >>>>> your driver) gets disabled as a result of PERIC0_IP disablement, which
> >>>>> might actually lead to that hang you observe. Usually it's some
> >>>>> meaningful leaf clock, e.g. GIC or interconnect clocks. Please check
> >>>>> clk-exynos850.c driver for CLK_IS_CRITICAL and CLK_IGNORE_UNUSED flags
> >>>>> and the corresponding comments I left there, maybe it'll give you more
> >>>>> particular idea about what to look for. Yes, making the whole CMU
> >>>>> always running without understanding why (i.e. because of which
> >>>>> particular leaf clock) might not be the best way of handling this
> >>>>
> >>>> because of CLK_GOUT_PERIC0_CLK_PERIC0_USI8_USI_CLK
> >>>
> >>> That's not a root cause here. And I think PERIC0_IP is neither.
> >>>
> >>
> >> you were right!
> >>>>
> >>>>> issue. I might be mistaken, but at least please check if you
> >>>>> implemented all clocks for PERIC0 first and if making some meaningful
> >>>>> leaf clock critical makes more sense.
> >>>>>
> >>
> >> I determined which leaf clocks shall be marked as critical. I enabled
> >> the debugfs clock write access. Then I made sure that the parents of the
> >> PERIC0 CMU have at least one user so that they don't get disabled after
> >> an enable-disable sequence on a leaf clock. Then I took all the PERIC0
> >> gate clocks and enabled and disabled them one by one. Whichever hung the
> >> system when the clock was disabled was marked as critical.
> >> The list of critical leaf clocks is as follows:
> >>
> >
> > Nice! I used a somewhat similar procedure for clk-exynos850, doing
> > basically the same thing, but in core clock driver code.
> >
> >> "gout_peric0_peric0_cmu_peric0_pclk",
> >> "gout_peric0_lhm_axi_p_peric0_i_clk",
> >> "gout_peric0_peric0_top1_ipclk_0",
> >> "gout_peric0_peric0_top1_pclk_0".
> >>
> >> I'll update v2 with this instead. Thanks for the help, Sam!
> >
> > Glad you weren't discouraged by my meticulousness :) In clk-exynos850
> > I usually used CLK_IGNORE_UNUSED for clocks like XXX_CMU_XXX (in your
> > case it's PERIC0_CMU_PERIC0), with a corresponding comment. Those
> > clocks usually can be used to disable the bus clock for corresponding
> > CMU IP-core (in your case CMU_PERIC0), which makes it impossible to
> > access the registers from that CMU block, as its register interface is
> > not clocked anymore. Guess I saw something similar in Exynos5433 or
> > Exynos7 clk drivers, or maybe Sylwester or Krzysztof told me to do so
> > -- don't really remember. For AXI clock it also seems logical to keep
> > it running (AXI bus might be used for GIC and memory). But again,
> > maybe CLK_IGNORE_UNUSED flag would be more appropriate than
> > CLK_IS_CRITICAL? For the last two clocks -- it's hard to tell what
> > exactly they do. Is TOP1 some other CMU or block name, and are there
> > any further users for those clocks?
> >
> > Anyways, if you are working on v2, please consider doing next two
> > things while at it:
> >
> > 1. For each critical clock: add corresponding comment explaining why
> > it's marked so
>
> Will do.
>
> > 2. Consider using CLK_IGNORE_UNUSED instead of CLK_IS_CRITICAL when
> > appropriate; both have their use in different cases
> >
> > Btw, if you check other Exynos clk drivers, there are a lot of examples
> > for flags like those.
> >
> Thanks for the feedback, it's educative.
>
> I played a little with the clk debugfs and I think all should be marked
> as critical.
> What I did was to make sure that their parents are enabled
> already and then I enabled and disabled each. Each time I disabled one
> of them the system hung. Thus in case they will be used, if one disables
> them on an error path, it will hang the system. We can't disable them at
> suspend either. Thus I propose to keep them as critical.
>

Do you see those clocks potentially being used by some actual consumers
in the future? If not, maybe CLK_IGNORE_UNUSED is enough (just to make
sure the core clock framework won't disable those during clock
initialization)? Anyway, I don't have any strong preferences in this
case. If you think CLK_IS_CRITICAL is better in this case, I'd say go
for it.

Also, on a slightly different note: please make sure there is no
"clk_ignore_unused" param in your kernel cmdline (e.g. passed from the
bootloader via dts). The clock driver should be functional without that
param, though it might take some additional work.

> Thanks!
> ta
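
To make the trade-off between the two flags concrete, here is a small
standalone toy model (not kernel code) of how I understand them to
behave. Everything in it is an assumption made for illustration: the
toy_* names, the TOY_* flag bits and the struct layout are hypothetical
and are not taken from the clock framework headers. It only tries to
capture two things: a critical clock holds a permanent reference, while
an ignore-unused clock is merely skipped by the late "disable unused"
sweep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy flag bits for this model only; not the kernel's definitions. */
#define TOY_CLK_IGNORE_UNUSED	(1u << 0)
#define TOY_CLK_IS_CRITICAL	(1u << 1)

struct toy_clk {
	const char *name;
	struct toy_clk *parent;
	unsigned int flags;
	unsigned int enable_count;	/* software reference count */
	bool hw_on;			/* modelled state of the gate */
};

/* Take a reference; ungate the parent chain first, then this clock. */
static void toy_clk_enable(struct toy_clk *clk)
{
	if (!clk)
		return;
	if (clk->enable_count++ == 0) {
		toy_clk_enable(clk->parent);
		clk->hw_on = true;
	}
}

/* Drop a reference; gate the clock once the last reference is gone. */
static void toy_clk_disable(struct toy_clk *clk)
{
	if (!clk)
		return;
	assert(clk->enable_count > 0);
	if (--clk->enable_count == 0) {
		clk->hw_on = false;
		toy_clk_disable(clk->parent);
	}
}

/* Registration: assume the bootloader left every gate running. */
static void toy_clk_register(struct toy_clk *clk)
{
	clk->hw_on = true;
	if (clk->flags & TOY_CLK_IS_CRITICAL)
		toy_clk_enable(clk);	/* permanent reference, never dropped */
}

/* Late sweep: gate unclaimed clocks unless they ask to be left alone. */
static void toy_clk_disable_unused(struct toy_clk *clks[], size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_clk *clk = clks[i];

		if (clk->enable_count == 0 &&
		    !(clk->flags & TOY_CLK_IGNORE_UNUSED))
			clk->hw_on = false;
	}
}
```

In this model an ignore-unused gate survives the sweep, but it is gated
as soon as some consumer enables and then disables it (for example on an
error path), whereas a critical gate stays on because its reference
count never drops to zero. If that matches the real behaviour on gs101,
it supports Tudor's argument for CLK_IS_CRITICAL on clocks that hang the
system whenever they are gated.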