On Tue, Jan 11, 2022 at 09:12:08AM -0800, Linus Torvalds wrote:
> On Mon, Jan 10, 2022 at 10:19 AM Jan Kara <jack@xxxxxxx> wrote:
> >
> > A task can end up indefinitely sleeping in do_select() ->
> > poll_schedule_timeout() when the following race happens:
> > [...]
>
> Ok, I decided to just take this as-is right now, and get it in early
> in the merge window, and see if anybody hollers.
>
> I don't think the stable people will try to apply it until after the
> merge window closes anyway, but it's worth pointing out that this
> change (commit 68514dacf271: "select: Fix indefinitely sleeping task
> in poll_schedule_timeout()" in my tree now) is very much a change of
> behavior, and we may have to revert it if it causes any issues.
>
> The most likely issue it would cause is that some program uses
> select() with an fd mask with extra garbage in it, and stale fd bits
> that pointed to closed file descriptors used to just be ignored. Now
> they'll cause select() to return immediately with those bits set.
>
> And that might then cause a program to perhaps still work, but
> busy-spin on select(), wasting CPU time. Or it will walk the result
> bits, see them set, try to read/write to them, get EBADF, and clear
> them. Or not clear them and just be very unhappy indeed.
>
> So while I think this version of the patch is still safer than the
> EBADF one - and I think better semantics that happen to match poll()
> too - I think this is a patch that could expose existing bad user
> space.
>
> We'll see. I considered adding a WARN_ON_ONCE() just to make the
> change in behavior more visible, but ended up not really feeling it.
>
> End result: I took this patch eagerly not because I was happy to do
> it, but simply because the earlier we test this, the earlier we'll
> know of any problems. Let's hope there are none.

Thanks for the heads-up for the stable tree relevance, I'll watch out
for this one.

greg k-h