On 1/12/20 11:32 AM, Al Viro wrote: > On Mon, Jan 06, 2020 at 09:04:01PM -0700, Jens Axboe wrote: >> On 1/5/20 6:30 PM, Stephen Rothwell wrote: >>> Hi all, >>> >>> After merging the block tree, today's linux-next build (arm >>> multi_v7_defconfig) failed like this: >>> >>> fs/open.c:977:12: error: conflicting types for 'build_open_flags' >>> 977 | inline int build_open_flags(const struct open_how *how, >>> | ^~~~~~~~~~~~~~~~ >>> In file included from /home/sfr/next/next/fs/open.c:36: >>> fs/internal.h:127:12: note: previous declaration of 'build_open_flags' was here >>> 127 | extern int build_open_flags(int flags, umode_t mode, struct open_flags *op); >>> | ^~~~~~~~~~~~~~~~ >>> >>> Caused by commits >>> >>> 4e9e15c9426e ("fs: make build_open_flags() available internally") >>> 3bba3e571bc8 ("io_uring: add support for IORING_OP_OPENAT") >>> >>> interacting with commit >>> >>> 0a51692d49ec ("open: introduce openat2(2) syscall") >>> >>> from the vfs tree. >>> >>> I have applied the following fix up patch for today: >> >> Thanks Stephen - I'll pull in the VFS tree and rebase the 5.6 io_uring >> bits on that. Then I'll send it out for review again, haven't heard from >> Al on the non-block open change. > > FWIW, I don't believe that your approach is workable. First of all, > *ANY* transition out of RCU mode can lead to blocking. You need to > acquire several references (mount and dentry, at the very least). > Suppose the last one fails (->d_seq mismatch). Now you suddenly > have to drop the one(s) you've acquired. And both dput() and mntput() > are fundamentally blocking operations. > > It simply does not work. You could cobble up something that kinda-sorta > works, if your added flag had > * caused hard failure on unlazy_child() > * caused hard failure on unlazy_walk() with any symlinks in stack > * caused hard failure on unlazy_walk() if it would've been required > to grab root > * made unlazy_walk() go through very careful dance if it's just > about nd->path; I'm not sure how well that could be done, but theoretically > it's not impossible. > > But for open() it's not going to work at all. Any open for write => you > will have to wait if you run into fs freeze. O_TRUNC => you've got IO > to do. Worst of all, once you've dropped out of RCU mode, *YOU* *CAN'T* > *FAIL*. Because that means blocking operations. So you need to verify > that you won't run into a blocking ->open(), IMA deciding to play silly > buggers and read through the entire file, etc., etc. _before_ dropping > out of RCU mode. > > do_last() is messy enough as it is; adding _this_ is completely out of > question. Thanks Al, that's useful! Sounds like the lookup is doable, but the open part is just a wasp nest of "don't even go there". For now, I'll drop the lookup change and just have the io_uring open punt to async. With that, I don't need any non-blocking guarantees. That is workable for now. > Jens, if you have a workable plan on that non-blocking open of yours, > post it in full details. Until then - NAK, and that's about as hard one > as I ever had to give. It's like the other io_uring opcodes - we prefer to try a non-blocking attempt first, and if that fails, then we go async. I have no grand async open design, was just hoping I could make it work with minimal effort. That's obviously not doable. I would not mind working on actually making it doable, but that's a bigger project than I originally wanted to take on. So the most likely outcome longer term is for io_uring to adopt a syslet type of approach to this, where we always just just call the open helper, and if we need to block/reschedule, then we move context to an appropriate worker thread. Time is better spent there rather than trying to make every useful system call provide a sane non-blocking path, I think. -- Jens Axboe