On Sat, Dec 3, 2022 at 7:05 AM Bartosz Golaszewski <brgl@xxxxxxxx> wrote: > > Here's a fixed PR from the GPIO subsystem for the next rc. No, this cannot be right. That last commit seems *very* dubious, and in particular all those if (!down_read_trylock(&gdev->sem)) return EPOLLHUP | EPOLLERR; are a sign that something is very very wrong there. Either the lock is necessary or it isn't, and "trylock" isn't the way to deal with it, with random failures if you cannot take the lock. If you are using "trylock" because the data structure might go away from under you, you have already lost, and the code is buggy. And if the data structure cannot go away from under you, you should do an unconditional lock, and then check "gdev->chip" for being NULL once you have gotten the lock (the same way you did in open()). But a "trylock and return error if it failed" just means that now you are randomly returning errors to user space, which is entirely undebuggable and makes no sense. Or, alternatively, the trylock succeeds - because it hits fully *after* gpiochip_remove() has finished, and now ->chip is NULL anyway, which is what you claim to protect against. End result: "trylock" can never be right in this kind of context. That "call_locked() helper might make sense more along the lines of ret = -ENODEV; down_read(&gdev->sem)) // Does the device still exist? if (gdev->chip) ret = func(file, cmd, arg); up_read(&gdev->sem); return ret; or similar. Not with that odd "try to lock, and if that fails, assume error". And again - if the trylock is there because 'gdev' itself might go away at any time and you can't afford to wait on the lock, then it's broken regardless (and the above suggestion won't help either) Anyway: the end result of this all is that I think this is a fundamental bug in the gpio layer, and rc7 (soon to be rc8) is too late to try these kinds of unfinished games. Fix it properly for 6.2, and make it back-portable, because I'm not pulling something like this right now. Linus