Re: [PATCH 0/5] clk: Make clk_rate_exclusive_get() return void

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On Wed, Dec 13, 2023 at 08:43:00AM +0100, Uwe Kleine-König wrote:
> On Wed, Dec 13, 2023 at 08:16:04AM +0100, Maxime Ripard wrote:
> > On Tue, Dec 12, 2023 at 06:26:37PM +0100, Uwe Kleine-König wrote:
> > > clk_rate_exclusive_get() returns zero unconditionally. Most users "know"
> > > that and don't check the return value. This series fixes the four users
> > > that do error checking on the returned value and then makes function
> > > return void.
> > > 
> > > Given that the changes to the drivers are simple and so merge conflicts
> > > (if any) should be easy to handle, I suggest to merge this complete
> > > series via the clk tree.
> > 
> > I don't think it's the right way to go about it.
> > 
> > clk_rate_exclusive_get() should be expected to fail. For example if
> > there's another user getting an exclusive rate on the same clock.
> > 
> > If we're not checking for it right now, then it should probably be
> > fixed, but the callers checking for the error are right to do so if they
> > rely on an exclusive rate. It's the ones that don't that should be
> > modified.
> 
> If some other consumer has already "locked" a clock that I call
> clk_rate_exclusive_get() for, this isn't an error. In my bubble I call
> this function because I don't want the rate to change e.g. because I
> setup some registers in the consuming device to provide a fixed UART
> baud rate or i2c bus frequency (and that works as expected).

I guess it's a larger conversation, but I don't see how that can
possibly work.

The way the API is designed, you have no guarantee (outside of
clk_rate_exclusive_*) that the rate is going to change.

And clk_rate_exclusive_get() doesn't allow the rate to change while in
the "critical section".

So the only possible thing to do is clk_set_rate() +
clk_rate_exclusive_get().

So there's a window where the clock can indeed be changed, and the
consumer that is about to lock its rate wouldn't be aware of it.

I guess it would work if you don't care about the rate at all, you just
want to make sure it doesn't change.

Out of the 7 users of that function, 3 are in that situation, so I guess
it's fair.

3 are open to that race condition I mentioned above.

1 is calling clk_set_rate while in the critical section, which works if
there's a single user but not if there's multiple, so it should be
discouraged.

> In this case I won't be able to change the rate of the clock, but that
> is signalled by clk_set_rate() failing (iff and when I need awother
> rate) which also seems the right place to fail to me.

Which is ignored by like half the callers, including the one odd case I
mentioned above.

And that's super confusing still: you can *always* get exclusivity, but
not always do whatever you want with the rate when you have it? How are
drivers supposed to recover from that? You can handle failing to get
exclusivity, but certainly not working around variable guarantees.

> It's like that since clk_rate_exclusive_get() was introduced in 2017
> (commit 55e9b8b7b806ec3f9a8817e13596682a5981c19c).

Right, but "it's always been that way" surely can't be an argument,
otherwise you wouldn't have done that series in the first place.

> BTW, I just noticed that my assertion "Most users \"know\" that
> [clk_rate_exclusive_get() returns zero unconditionally]" is wrong. As of
> next-20231213 there are 3 callers ignoring the return value of
> clk_rate_exclusive_get() and 4 that handle (imaginary) returned errors.
> I expected this function to be used more extensively. (In fact I think
> it should be used more as several drivers rely on the clk rate not
> changing.)

Yes, but also it's super difficult to use in practice, and most devices
don't care.

The current situation is something like this:

  * Only a handful of devices really care about their clock rate, and
    often only for one of their clock if they have several. You would
    probably get all the devices that create an analog signal somehow
    there, so audio, display, i2c, spi, uarts, etc. Plus the ones doing
    frequency scaling so CPU and GPUs.

  * CPUs and GPUs are very likely to have a dedicated clock, so we can
    rule the "another user is going to mess with my clock" case.

  * UARTs/i2c/etc. are usually taking their clock from the bus interface
    directly which is pretty much never going to change (for good
    reason). And the rate of the bus is not really likely to change.

  * SPI/NAND/MMC usually have their dedicated clock too, and the bus
    rate is not likely to change after the initial setup either.

So, the only affected devices are the ones generating external signals,
with the rate changing during the life of the system. Even for audio or
video devices, that's fairly unlikely to happen. And you need to have
multiple devices sharing the same clock tree for that issue to occur,
which is further reducing the chances it happens.

Realistically speaking, this only occurs with multi-head display outputs
where it's somewhat likely to have all the display controllers feeding
from the same clock, and the power up of the various output is done in
sequence which creates that situation.

And even then, the clk_rate_exclusive_* interface effectively locks the
entire clock subtree to its current rate, so the effect on the rest of
the devices can be significant.

So... yeah. Even though you're right, it's trying to address a problem
that is super unlikely to happen with a pretty big hammer that might be
too much for most. So it's not really surprising it's not used more.

Maxime

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [ARM Kernel]     [Linux ARM]     [Linux ARM MSM]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux