On 12/10/20 3:29 PM, Dave Chinner wrote: > On Thu, Dec 10, 2020 at 01:01:14PM -0700, Jens Axboe wrote: >> Now that we support non-blocking path resolution internally, expose it >> via openat2() in the struct open_how ->resolve flags. This allows >> applications using openat2() to limit path resolution to the extent that >> it is already cached. >> >> If the lookup cannot be satisfied in a non-blocking manner, openat2(2) >> will return -1/-EAGAIN. >> >> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> >> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> >> --- >> fs/open.c | 2 ++ >> include/linux/fcntl.h | 2 +- >> include/uapi/linux/openat2.h | 2 ++ >> 3 files changed, 5 insertions(+), 1 deletion(-) >> >> diff --git a/fs/open.c b/fs/open.c >> index 9af548fb841b..07dc9f3d1628 100644 >> --- a/fs/open.c >> +++ b/fs/open.c >> @@ -1087,6 +1087,8 @@ inline int build_open_flags(const struct open_how *how, struct open_flags *op) >> lookup_flags |= LOOKUP_BENEATH; >> if (how->resolve & RESOLVE_IN_ROOT) >> lookup_flags |= LOOKUP_IN_ROOT; >> + if (how->resolve & RESOLVE_NONBLOCK) >> + lookup_flags |= LOOKUP_NONBLOCK; >> >> op->lookup_flags = lookup_flags; >> return 0; >> diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h >> index 921e750843e6..919a13c9317c 100644 >> --- a/include/linux/fcntl.h >> +++ b/include/linux/fcntl.h >> @@ -19,7 +19,7 @@ >> /* List of all valid flags for the how->resolve argument: */ >> #define VALID_RESOLVE_FLAGS \ >> (RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS | RESOLVE_NO_SYMLINKS | \ >> - RESOLVE_BENEATH | RESOLVE_IN_ROOT) >> + RESOLVE_BENEATH | RESOLVE_IN_ROOT | RESOLVE_NONBLOCK) >> >> /* List of all open_how "versions". */ >> #define OPEN_HOW_SIZE_VER0 24 /* sizeof first published struct */ >> diff --git a/include/uapi/linux/openat2.h b/include/uapi/linux/openat2.h >> index 58b1eb711360..ddbf0796841a 100644 >> --- a/include/uapi/linux/openat2.h >> +++ b/include/uapi/linux/openat2.h >> @@ -35,5 +35,7 @@ struct open_how { >> #define RESOLVE_IN_ROOT 0x10 /* Make all jumps to "/" and ".." >> be scoped inside the dirfd >> (similar to chroot(2)). */ >> +#define RESOLVE_NONBLOCK 0x20 /* Only complete if resolution can be >> + done without IO */ > > I don't think this describes the implementation correctly - it has > nothing to actually do with whether IO is needed, just whether the > lookup can be done without taking blocking locks. The slow path can > complete without doing IO - it might miss the dentry cache but hit > the filesystem buffer cache on lookup and the inode cache when > retrieving the inode. And it may not even block anywhere doing this. > > So, really, this isn't avoiding IO at all - it's avoiding the > possibility of running a lookup path that might blocking on > something. Right, it's about not blocking, as the commit message says. That could be IO, either directly or indirectly, or it could be just locking. I'll update this comment. > This also needs a openat2(2) man page update explaining exactly what > behaviour/semantics this flag provides and that userspace can rely > on when this flag is set... Agree, I'll add that. > We've been failing to define the behaviour of our interfaces clearly, > especially around non-blocking IO behaviour in recent times. We need > to fix that, not make matters worse by adding new, poorly defined > non-blocking behaviours... Also agree on that! It doesn't help that different folks have different criteria of what nowait/nonblock means... > I'd also like to know how we actually test this is working- a > reliable regression test for fstests would be very useful for > ensuring that the behaviour as defined by the man page is not broken > accidentally by future changes... Definitely. On the io_uring side, I generally run with a debug patch that triggers if anything blocks off the submission path. That won't really work for this one, however. FWIW, I'm quite fine deferring this patch, I obviously care more about the io_uring side of things. It seems like a no-brainer to support for openat2 though, as it would allow applications to decide when to punt open to another thread. Since this one ties in very closely with LOOKUP_RCU, I'm not _too_ worried about this one in particular. But it would be great to have something that we could use with eg RWF_NOWAIT as well, and similar constructs. I'll give it some thought. -- Jens Axboe