On Fri, Apr 21, 2023 at 7:02 AM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > This contains Jens' work to support FMODE_NOWAIT and thus IOCB_NOWAIT > for pipes ensuring that all places can deal with non-blocking requests. > > To this end, pass down the information that this is a nonblocking > request so that pipe locking, allocation, and buffer checking correctly > deal with those. Ok, I pulled this, but then I unpulled it again. Doing conditional locking for O_NONBLOCK and friends is not ok. Yes, it's been done, and I may even have let some slip through, but it's just WRONG. There is absolutely no situation where a "ok, so the lock on this data structure was taken, we'll go to some async model" is worth it. Every single time I've seen this, it's been some developer who thinks that O_NONBLOCk is somehow some absolute "this cannot schedule" thing. And every single time it's been broken and horrid crap that just made everything more complicated and slowed things down. If some lock wait is a real problem, then the lock needs to be just fixed. Not "ok, let's make a non-blocking version and fall back if it's held". Note that FMODE_NOWAIT does not mean (and *CANNOT* mean) that you'd somehow be able to do the IO in some atomic context anyway. Many of our kernel locks don't even support that (eg mutexes). So thinking that FMODE_NOWAIT is that kind of absolute is the wrong kind of thinking entirely. FMODE_NOWAIT should mean that no *IO* gets done. And yes, that might mean that allocations fail too. But not this kind of "let's turn locking into 'trylock' stuff". The whoe flag is misnamed. It should have been FMODE_NOIO, the same way we have IOCB_NOIO. If you want FMODE_ATOMIC, then that is something entirely and completely different, and is probably crazy. We have done it in one area (RCU pathname lookup), and it was worth it there, and it was a *huge* undertaking. It was worth it, but it was worth it because it was a serious thing with some serious design and a critical area. Not this kind of "conditional trylock" garbage which just means that people will require 'poll()' to now add the lock to the waitqueue, or make all callers go into some "let's use a different thread instead" logic. Linus