On 12/15/20 7:36 PM, Al Viro wrote:
> On Mon, Dec 14, 2020 at 12:13:22PM -0700, Jens Axboe wrote:
>> io_uring always punts opens to async context, since there's no control
>> over whether the lookup blocks or not. Add LOOKUP_NONBLOCK to support
>> just doing the fast RCU based lookups, which we know will not block. If
>> we can do a cached path resolution of the filename, then we don't have
>> to always punt lookups for a worker.
>>
>> We explicitly disallow O_CREAT | O_TRUNC opens, as those will require
>> blocking, and O_TMPFILE as that requires filesystem interactions and
>> there's currently no way to pass down an attempt to do nonblocking
>> operations there. This basically boils down to whether or not we can
>> do the fast path of open or not. If we can't, then return -EAGAIN and
>> let the caller retry from an appropriate context that can handle
>> blocking.
>>
>> During path resolution, we always do LOOKUP_RCU first. If that fails and
>> we terminate LOOKUP_RCU, then fail a LOOKUP_NONBLOCK attempt as well.
>
> Ho-hum... FWIW, I'm tempted to do the same change of calling
> conventions for unlazy_child() (try_to_unlazy_child(), true on
> success). OTOH, the call site is right next to removal of
> unlikely(status == -ECHILD) suggested a few days ago...
>
> Mind if I take your first commit + that removal of unlikely + change
> of calling conventions for unlazy_child() into #work.namei (based at
> 5.10), so that the rest of your series got rebased on top of that?

Of course, go ahead.

>> @@ -3299,7 +3315,16 @@ static int do_tmpfile(struct nameidata *nd, unsigned flags,
>>  {
>>  	struct dentry *child;
>>  	struct path path;
>> -	int error = path_lookupat(nd, flags | LOOKUP_DIRECTORY, &path);
>> +	int error;
>> +
>> +	/*
>> +	 * We can't guarantee that the fs doesn't block further down, so
>> +	 * just disallow nonblock attempts at O_TMPFILE for now.
>> +	 */
>> +	if (flags & LOOKUP_NONBLOCK)
>> +		return -EAGAIN;
>
> Not sure I like it here, TBH...

This ties in with the later email, so you'd prefer to gate this upfront
instead of putting it in here? I'm fine with that.

-- 
Jens Axboe
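
To illustrate the "gate this upfront" idea from the exchange above, here is a minimal user-space sketch in C. It is not kernel code and not the patch from this thread. The SK_* flag bits, sk_flags_need_blocking() and sk_open_nonblock() are hypothetical names made up for this example, and the `cached` parameter only stands in for whether an RCU-walk lookup would have succeeded.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical flag bits for this sketch only. They stand in for
 * O_CREAT, O_TRUNC and O_TMPFILE and are NOT the real kernel or
 * libc values.
 */
enum {
	SK_CREAT   = 1 << 0,
	SK_TRUNC   = 1 << 1,
	SK_TMPFILE = 1 << 2,
};

/*
 * Hypothetical helper: returns true when the open flags imply work
 * that may block, so a nonblocking attempt should not even start.
 */
static bool sk_flags_need_blocking(int open_flags)
{
	return (open_flags & (SK_CREAT | SK_TRUNC | SK_TMPFILE)) != 0;
}

/*
 * Hypothetical model of a nonblocking open attempt.
 *
 * The flag check sits at the top, before any lookup is tried, rather
 * than deep inside a tmpfile-specific path. `cached` models whether
 * the fast, cached lookup would succeed.
 *
 * Returns 0 on success, or -EAGAIN when the caller should retry from
 * a context that is allowed to block.
 */
static int sk_open_nonblock(int open_flags, bool cached)
{
	if (sk_flags_need_blocking(open_flags))
		return -EAGAIN;	/* gated upfront */

	if (!cached)
		return -EAGAIN;	/* fast lookup failed, do not fall back */

	return 0;
}
```

In this model a caller would treat -EAGAIN the same way in both cases: hand the request to a worker that can block and retry it there.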