Re: [GIT PULL] gpio: fixes for v6.1-rc8 - take 2

On Sun, Dec 4, 2022 at 9:17 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sat, Dec 3, 2022 at 7:05 AM Bartosz Golaszewski <brgl@xxxxxxxx> wrote:
> >
> > Here's a fixed PR from the GPIO subsystem for the next rc.
>
> No, this cannot be right.
>
> That last commit seems *very* dubious, and in particular all those
>
>         if (!down_read_trylock(&gdev->sem))
>                 return EPOLLHUP | EPOLLERR;
>
> are a sign that something is very very wrong there.
>
> Either the lock is necessary or it isn't, and "trylock" isn't the way
> to deal with it, with random failures if you cannot take the lock.
>
> If you are using "trylock" because the data structure might go away
> from under you, you have already lost, and the code is buggy.
>
> And if the data structure cannot go away from under you, you should
> do an unconditional lock, and then check "gdev->chip" for being NULL
> once you have gotten the lock (the same way you did in open()).
>

No, the data can't be removed while these locks are held. The trylock
is only there to avoid entering the callbacks when gpiochip_remove()
is already in progress (the only reason the trylock would fail).

> But a "trylock and return error if it failed" just means that now you
> are randomly returning errors to user space, which is entirely
> undebuggable and makes no sense.
>

Technically these are the same errors we'd return later if gdev->chip
is NULL, but I get your point.

> Or, alternatively, the trylock succeeds - because it hits fully
> *after* gpiochip_remove() has finished, and now ->chip is NULL anyway,
> which is what you claim to protect against.
>
> End result: "trylock" can never be right in this kind of context.
>
> That "call_locked()" helper might make sense more along the lines of
>
>         ret = -ENODEV;
>
>         down_read(&gdev->sem);
>         // Does the device still exist?
>         if (gdev->chip)
>                 ret = func(file, cmd, arg);
>         up_read(&gdev->sem);
>
>         return ret;
>

This is a good suggestion, thanks. And with it, the two patches can
get squashed into one for easier backporting.

> or similar. Not with that odd "try to lock, and if that fails, assume error".
>
> And again - if the trylock is there because 'gdev' itself might go
> away at any time and you can't afford to wait on the lock, then it's
> broken regardless (and the above suggestion won't help either)
>
> Anyway: the end result of this all is that I think this is a
> fundamental bug in the gpio layer, and rc7 (soon to be rc8) is too
> late to try these kinds of unfinished games.
>
> Fix it properly for 6.2, and make it back-portable, because I'm not
> pulling something like this right now.
>
>                Linus

Will do. I will still resend the PR with only the resource leak fixes.

Bartosz


