Re: [PATCH 0/5] clk: Make clk_rate_exclusive_get() return void

Jerome Brunet <jbrunet@xxxxxxxxxxxx> · Fri, 15 Dec 2023 10:46:29 +0100

On Wed 13 Dec 2023 at 17:44, Neil Armstrong <neil.armstrong@xxxxxxxxxx> wrote:

> Hi Maxime,
>
> Le 13/12/2023 à 09:36, Maxime Ripard a écrit :
>> Hi,
>> On Wed, Dec 13, 2023 at 08:43:00AM +0100, Uwe Kleine-König wrote:
>>> On Wed, Dec 13, 2023 at 08:16:04AM +0100, Maxime Ripard wrote:
>>>> On Tue, Dec 12, 2023 at 06:26:37PM +0100, Uwe Kleine-König wrote:
>>>>> clk_rate_exclusive_get() returns zero unconditionally. Most users "know"
>>>>> that and don't check the return value. This series fixes the four users
>>>>> that do error checking on the returned value and then makes function
>>>>> return void.
>>>>>
>>>>> Given that the changes to the drivers are simple and so merge conflicts
>>>>> (if any) should be easy to handle, I suggest to merge this complete
>>>>> series via the clk tree.
>>>>
>>>> I don't think it's the right way to go about it.
>>>>
>>>> clk_rate_exclusive_get() should be expected to fail. For example if
>>>> there's another user getting an exclusive rate on the same clock.
>>>>
>>>> If we're not checking for it right now, then it should probably be
>>>> fixed, but the callers checking for the error are right to do so if they
>>>> rely on an exclusive rate. It's the ones that don't that should be
>>>> modified.
>>>
>>> If some other consumer has already "locked" a clock that I call
>>> clk_rate_exclusive_get() for, this isn't an error. In my bubble I call
>>> this function because I don't want the rate to change e.g. because I
>>> setup some registers in the consuming device to provide a fixed UART
>>> baud rate or i2c bus frequency (and that works as expected).
>> I guess it's a larger conversation, but I don't see how that can
>> possibly work.
>> The way the API is designed, you have no guarantee (outside of
>> clk_rate_exclusive_*) that the rate is going to change.
>> And clk_rate_exclusive_get() doesn't allow the rate to change while in
>> the "critical section".
>> So the only possible thing to do is clk_set_rate() +
>> clk_rate_exclusive_get().
>
> There's clk_set_rate_exclusive() for this purpose.
>
>> So there's a window where the clock can indeed be changed, and the
>> consumer that is about to lock its rate wouldn't be aware of it.
>> I guess it would work if you don't care about the rate at all, you just
>> want to make sure it doesn't change.
>> Out of the 7 users of that function, 3 are in that situation, so I guess
>> it's fair.
>> 3 are open to that race condition I mentioned above.
>> 1 is calling clk_set_rate while in the critical section, which works if
>> there's a single user but not if there's multiple, so it should be
>> discouraged.
>> 
>>> In this case I won't be able to change the rate of the clock, but that
>>> is signalled by clk_set_rate() failing (iff and when I need awother
>>> rate) which also seems the right place to fail to me.
>> Which is ignored by like half the callers, including the one odd case I
>> mentioned above.
>> And that's super confusing still: you can *always* get exclusivity, but
>> not always do whatever you want with the rate when you have it? How are
>> drivers supposed to recover from that? You can handle failing to get
>> exclusivity, but certainly not working around variable guarantees.
>> 
>>> It's like that since clk_rate_exclusive_get() was introduced in 2017
>>> (commit 55e9b8b7b806ec3f9a8817e13596682a5981c19c).
>> Right, but "it's always been that way" surely can't be an argument,
>> otherwise you wouldn't have done that series in the first place.
>> 
>>> BTW, I just noticed that my assertion "Most users \"know\" that
>>> [clk_rate_exclusive_get() returns zero unconditionally]" is wrong. As of
>>> next-20231213 there are 3 callers ignoring the return value of
>>> clk_rate_exclusive_get() and 4 that handle (imaginary) returned errors.
>>> I expected this function to be used more extensively. (In fact I think
>>> it should be used more as several drivers rely on the clk rate not
>>> changing.)
>> Yes, but also it's super difficult to use in practice, and most devices
>> don't care.
>> The current situation is something like this:
>>    * Only a handful of devices really care about their clock rate, and
>>      often only for one of their clock if they have several. You would
>>      probably get all the devices that create an analog signal somehow
>>      there, so audio, display, i2c, spi, uarts, etc. Plus the ones doing
>>      frequency scaling so CPU and GPUs.
>>    * CPUs and GPUs are very likely to have a dedicated clock, so we can
>>      rule the "another user is going to mess with my clock" case.
>>    * UARTs/i2c/etc. are usually taking their clock from the bus interface
>>      directly which is pretty much never going to change (for good
>>      reason). And the rate of the bus is not really likely to change.
>>    * SPI/NAND/MMC usually have their dedicated clock too, and the bus
>>      rate is not likely to change after the initial setup either.
>> So, the only affected devices are the ones generating external signals,
>> with the rate changing during the life of the system. Even for audio or
>> video devices, that's fairly unlikely to happen. And you need to have
>> multiple devices sharing the same clock tree for that issue to occur,
>> which is further reducing the chances it happens.
>
> Well, thanks for HW designers, this exists and some SoCs has less PLLs than
> needed, and they can't be dedicated for some hw blocks.

Indeed. Even if there are enough PLLs, the exclusive API might help.
The idea is to force the second consumer to pick another "free" PLL if
it needs another rate.

If it can work with the rate currently locked by the first consumer,
then CCF may just pick that one, saving a PLL for future use. If it cant,
the protection will force the use of another PLL.

Without the exclusive API, the second consummer may just wreck the PLL of
the first consummer, regardless of the other PLL available.

Of course, if there is enough PLL, the other solution is manual
allocation, using assigned-parent and CLK_NO_REPARENT.

>
>> Realistically speaking, this only occurs with multi-head display outputs
>> where it's somewhat likely to have all the display controllers feeding
>> from the same clock, and the power up of the various output is done in
>> sequence which creates that situation.
>> And even then, the clk_rate_exclusive_* interface effectively locks the
>> entire clock subtree to its current rate, so the effect on the rest of
>> the devices can be significant.
>> So... yeah. Even though you're right, it's trying to address a problem
>> that is super unlikely to happen with a pretty big hammer that might be
>> too much for most. So it's not really surprising it's not used more.
>
> Honestly I tried my best to find a smart way to set the DSI clock tree
> with only 2 endpoints of the tree, but CCF will explore all possibilities
> and since you cannot set constraints, locking a sub-tree is the smartest
> way I found.
> In this case, the PLL is common between the DSI controller and video generator,
> so to keep the expected clock ratio, the smart way is to set the freq on
> one side, lock the subtree and set the rate on the other side.
> An API permitting to set multiple rates to multiple clocks in a single call
> would be the solution, but not sure if we could possibly write such algorithm.
>
>> Maxime
>
> Neil

-- 
Jerome