Re: [PATCH v1 1/1] Revert "pinctrl: avoid unsafe code pattern in find_pinctrl()"

Linus Walleij <linus.walleij@xxxxxxxxxx> · Thu, 19 Oct 2023 10:12:57 +0200

On Thu, Oct 19, 2023 at 12:41 AM Dmitry Torokhov
<dmitry.torokhov@xxxxxxxxx> wrote:

> Linus, BTW, I think there are more problems there with pinctrl lookup,
> because, if we assume there are concurrent accesses to pinctrl_get(),
> the fact that we did not find an instance while scanning the list does
> not mean we will not find it when we go to insert a newly created one.

I'm not surprised. Pinctrl comes from a time when the probing was
mostly serial, and since subsystems (such as MMC) has increasingly
added asynchronous probing, accompanied by rework of device tree
core etc (device tree didn't even exist when we started pin control)
to parallelize probing based on device hierarchy topology etc.

The people making probes ever more asynchronous probably just
tested if the system still boots and did not bother to go and look at
any rough edges in resource supplying subsystems, including clk,
regulator, gpio, reset, pin control...

There is a reason to why it mostly works, which I explain below:

> Another problem, as far as I can see, that there is not really a defined
> owner of pinctrl structure, it is created on demand, and destroyed when
> last user is gone. So if we execute last pintctrl_put() and there is
> another pinctrl_get() running simultaneously, we may get and bump up the
> refcount, and then release (pinctrl_free) will acquire the mutex, and
> zap the structure.

You mean we need to acquire the mutex in the code that calls
find_pinctrl() instead of inside find_pinctrl()? Yes I think you're right,
wanna do a patch?

It is largely theoretical because of the following:

A pin control handle is
usually taken by a driver for a device, it is usually unique
for that exact hardware (in difference from a clock, or a regulator,
which is often shared), so the scenario you are designing for here
would be that the driver for a device is probing the *same* hardware
on two runpaths, which is not going happen, right?

So while the software is not race-safe, the nature of the hardware
makes it safe: there is just one device instance per pin control
handle.

I haven't thought it through in detail so there may be corner
cases.

> Given that there are more issues in that code, maybe we should revert
> the patch for now so Andy has a chance to convert to UUID/LABEL booting?

Yeah I reverted it, the above elaboration may apply to this patch
too and makes me feel we are "mostly safe" in this regard anyway.

Yours,
Linus Walleij