On 12/10/20 7:45 PM, Al Viro wrote: > On Thu, Dec 10, 2020 at 02:06:39PM -0700, Jens Axboe wrote: >> On 12/10/20 1:53 PM, Linus Torvalds wrote: >>> On Thu, Dec 10, 2020 at 12:01 PM Jens Axboe <axboe@xxxxxxxxx> wrote: >>>> >>>> io_uring always punts opens to async context, since there's no control >>>> over whether the lookup blocks or not. Add LOOKUP_NONBLOCK to support >>>> just doing the fast RCU based lookups, which we know will not block. If >>>> we can do a cached path resolution of the filename, then we don't have >>>> to always punt lookups for a worker. >>> >>> Ok, this looks much better to me just from the name change. >>> >>> Half of the patch is admittedly just to make sure it now returns the >>> right error from unlazy_walk (rather than knowing it's always >>> -ECHILD), and that could be its own thing, but I'm not sure it's even >>> worth splitting up. The only reason to do it would be to perhaps make >>> it really clear which part is the actual change, and which is just >>> that error handling cleanup. >>> >>> So it looks fine to me, but I will leave this all to Al. >> >> I did consider doing a prep patch just making the error handling clearer >> and get rid of the -ECHILD assumption, since it's pretty odd and not >> even something I'd expect to see in there. Al, do you want a prep patch >> to do that to make the change simpler/cleaner? > > No, I do not. Why bother returning anything other than -ECHILD, when > you can just have path_init() treat you flag sans LOOKUP_RCU as "fail > with -EAGAIN now" and be done with that? > > What's the point propagating that thing when we are going to call the > non-RCU variant next if we get -ECHILD? Let's at least make it consistent - there is already one spot in there that passes the return value back (see below). > And that still doesn't answer the questions about the difference between > ->d_revalidate() and ->get_link() (for the latter you keep the call in > RCU mode, for the former you generate that -EAGAIN crap). Or between > ->d_revalidate() and ->permission(), for that matter. I believe these are moot with the updated patch from the other email. > Finally, I really wonder what is that for; if you are in conditions when > you really don't want to risk going to sleep, you do *NOT* want to > do mnt_want_write(). Or ->open(). Or truncate(). Or, for Cthulhu > sake, IMA hash calculation. I just want to do the RCU side lookup, that is all. That's my fast path. If that doesn't work, then we'll go through the motions of pushing this to a context that allows blocking open. > So how hard are your "we don't want to block here" requirements? Because > the stuff you do after complete_walk() can easily be much longer than > everything else. Ideally it'd extend a bit beyond the RCU lookup, as things like proc resolution will still fail with the proposed patch. But that's not a huge deal to me, I consider the dentry lookup to be Good Enough. commit bbfc4b98da8c5d9a64ae202952aa52ae6bb54dbd Author: Jens Axboe <axboe@xxxxxxxxx> Date: Thu Dec 10 14:10:37 2020 -0700 fs: make unlazy_walk() error handling consistent Most callers check for non-zero return, and assume it's -ECHILD (which it always will be). One caller uses the actual error return. Clean this up and make it fully consistent, by having unlazy_walk() return a bool instead. No functional changes in this patch. Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> diff --git a/fs/namei.c b/fs/namei.c index 03d0e11e4f36..d7952f863e79 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -679,7 +679,7 @@ static bool legitimize_root(struct nameidata *nd) * Nothing should touch nameidata between unlazy_walk() failure and * terminate_walk(). */ -static int unlazy_walk(struct nameidata *nd) +static bool unlazy_walk(struct nameidata *nd) { struct dentry *parent = nd->path.dentry; @@ -694,14 +694,14 @@ static int unlazy_walk(struct nameidata *nd) goto out; rcu_read_unlock(); BUG_ON(nd->inode != parent->d_inode); - return 0; + return false; out1: nd->path.mnt = NULL; nd->path.dentry = NULL; out: rcu_read_unlock(); - return -ECHILD; + return true; } /** @@ -3151,9 +3151,8 @@ static const char *open_last_lookups(struct nameidata *nd, } else { /* create side of things */ if (nd->flags & LOOKUP_RCU) { - error = unlazy_walk(nd); - if (unlikely(error)) - return ERR_PTR(error); + if (unlazy_walk(nd)) + return ERR_PTR(-ECHILD); } audit_inode(nd->name, dir, AUDIT_INODE_PARENT); /* trailing slashes? */ -- Jens Axboe