On 12/12/20 10:25 AM, Al Viro wrote: > On Sat, Dec 12, 2020 at 09:51:04AM -0700, Jens Axboe wrote: > >> struct dentry *dentry; >> @@ -3164,17 +3165,38 @@ static const char *open_last_lookups(struct nameidata *nd, >> } >> >> if (open_flag & (O_CREAT | O_TRUNC | O_WRONLY | O_RDWR)) { >> - got_write = !mnt_want_write(nd->path.mnt); >> + if (nonblock) { >> + got_write = !mnt_want_write_trylock(nd->path.mnt); >> + if (!got_write) >> + return ERR_PTR(-EAGAIN); >> + } else { >> + got_write = !mnt_want_write(nd->path.mnt); >> + } >> /* >> * do _not_ fail yet - we might not need that or fail with >> * a different error; let lookup_open() decide; we'll be >> * dropping this one anyway. >> */ > > Read the comment immediately after the place you are modifying. Really. > To elaborate: consider e.g. the case when /mnt/foo is a symlink to /tmp/bar, > /mnt is mounted r/o and you are asking to open /mnt/foo for write. We get > to /mnt, call open_last_lookups() to resolve the last component ("foo") in > it. And find that the sucker happens to be an absolute symlink, so we need > to jump into root and resolve "tmp/bar" staring from there. Which is what > the loop in the caller is about. Eventually we'll get to /tmp and call > open_last_lookups() to resolve "bar" there. /tmp needs to be mounted > writable; /mnt does not. > > Sure, you bail out only in nonblock case, so normally the next time around > it'll go sanely. But you are making the damn thing (and it's still too > convoluted, even after a lot of massage towards sanity) harder to reason > about. Thanks - I did read that comment, but reading your full explanation it's starting to make sense. I'll have it follow the same paradigm for the nonblocking side. >> + if (open_flag & O_CREAT) { >> + if (nonblock) { >> + if (!inode_trylock(dir->d_inode)) { >> + dentry = ERR_PTR(-EAGAIN); >> + goto drop_write; >> + } >> + } else { >> + inode_lock(dir->d_inode); >> + } >> + } else { >> + if (nonblock) { >> + if (!inode_trylock_shared(dir->d_inode)) { >> + dentry = ERR_PTR(-EAGAIN); >> + goto drop_write; >> + } >> + } else { >> + inode_lock_shared(dir->d_inode); >> + } >> + } >> dentry = lookup_open(nd, file, op, got_write); > > ... as well as more bloated, with no obvious benefits (take a look > at lookup_open()). Can you elaborate? It's hard not to make if (nonblock) { trylock; } else lock; } not look/feel bloated, as it tends to explode it like the above. Can hide it in helpers, but really, the logic needs to look like that... -- Jens Axboe