Quoting Doug Anderson (2024-03-28 09:39:54)
>
> I spent a bunch of time discussing this offline with Stephen and I'll
> try to summarize. Hopefully this isn't too much nonsense...
>
> 1. We'll likely land the patches downstream in ChromeOS for now while
> we're figuring things out since we're seeing actual breakages. Whether
> to land upstream is a question. The first patch is a bit of a hack but
> unlikely to cause any real problems. The second patch seems correct
> but it also feels like it's going to cause stuck clocks for a pile of
> other SoCs because we're not adding hacks similar to the sc7180 hack
> for all the other SoCs. I guess we could hope we get lucky or play
> whack-a-mole? ...or we try to find a more generic solution... Dunno
> what others think.

I think we should hope to get lucky or play whack-a-mole and merge
something like this series. If we have to we can similarly turn off RCGs
or branches during driver probe that are using shared parents like we
have on sc7180.

Put simply, the shared RCG implementation is broken because it reports
the wrong parent for clk_ops::get_parent() and doesn't clear the force
enable bit. With the current code we're switching the parent to XO when
the clk is enabled the first time. That's an obvious bug that we should
fix regardless of implementing proper clk handoff. We haven't
implemented handoff in over a decade. Blocking this bug fix on
implementing handoff isn't practical.

Furthermore, we're relying on clk consumers to clear that force enable
bit by enabling the clk once. That doesn't make any sense, although we
could use that force enable bit to consider the RCG as enabled for
clk_disable_unused.

An alternative approach to this series would be to force all shared RCGs
to be parented to XO at clk registration time, and continue to clear
that RCG force enable bit. That's sort of what Dmitry was going for
earlier. Doing this would break anything that's relying on the clks
staying enabled at some frequency through boot, but that isn't supported
anyway because clk handoff isn't implemented. It avoids the problem that
the first patch is for too because XO doesn't turn off causing a clk to
get stuck on. I can certainly craft this patch up if folks think that's
better.

To ease the transition we can make a new clk_ops for the RCG as well so
that each SoC has to opt-in to use this behavior. Then we can be certain
that other platforms aren't affected without being tested first. I'd
prefer to not do that though because I fear we'll be leaving drivers in
the broken state for some time.
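
To make the alternative above concrete, here is a rough userspace model
of "park the shared RCG on XO at registration and clear the force enable
bit". It is only a sketch and not the actual patch: the toy_rcg struct,
the function names, the bit positions and the assumption that source 0
means XO are all hypothetical and chosen for illustration. The update
handshake that real hardware needs is left out of the model.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative register layout for a toy RCG. Assumption of this sketch,
 * not taken from any real SoC header. */
#define TOY_CMD_ROOT_EN    (1u << 1)   /* the "force enable" bit */
#define TOY_CFG_SRC_SHIFT  8
#define TOY_CFG_SRC_MASK   (0x7u << TOY_CFG_SRC_SHIFT)
#define TOY_SRC_XO         0u          /* hypothetical: source 0 is XO */

/* In-memory stand-in for the RCG's command and config registers. */
struct toy_rcg {
	uint32_t cmd;
	uint32_t cfg;
};

/* Report the parent the hardware is really using, straight from CFG. */
static uint32_t toy_rcg_get_src(const struct toy_rcg *rcg)
{
	return (rcg->cfg & TOY_CFG_SRC_MASK) >> TOY_CFG_SRC_SHIFT;
}

/*
 * Hypothetical registration-time hook: reparent to XO and drop the
 * force enable bit, instead of waiting for a consumer to enable the
 * clk once.
 */
static void toy_rcg_park_at_init(struct toy_rcg *rcg)
{
	rcg->cfg = (rcg->cfg & ~TOY_CFG_SRC_MASK) |
		   (TOY_SRC_XO << TOY_CFG_SRC_SHIFT);
	rcg->cmd &= ~TOY_CMD_ROOT_EN;
}

/* Returns 0 if a bootloader-configured RCG ends up parked as expected. */
static int toy_rcg_selfcheck(void)
{
	/* Pretend the bootloader left source 1 selected, force enabled. */
	struct toy_rcg rcg = {
		.cmd = TOY_CMD_ROOT_EN,
		.cfg = 1u << TOY_CFG_SRC_SHIFT,
	};

	toy_rcg_park_at_init(&rcg);

	if (toy_rcg_get_src(&rcg) != TOY_SRC_XO)
		return 1;
	if (rcg.cmd & TOY_CMD_ROOT_EN)
		return 1;
	return 0;
}
```

After toy_rcg_park_at_init() the parent read back from the model matches
what the model is actually configured for, and nothing depends on a
consumer enabling the clk first. An opt-in variant would presumably only
run this hook from a separate clk_ops that an SoC driver has to select.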