Apologies, this message was intended as a reply to Al Viro, but I accidentally deleted that message so I'm replying to the reply instead. On Mon, Oct 09 2023 at 16:14:27 +0100, Luis Henriques quoth thus: > Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes: > > > On Fri, Sep 29, 2023 at 03:04:56PM +0100, Luis Henriques wrote: > > > >> -int do_unlinkat(int dfd, struct filename *name) > >> +int do_unlinkat(int dfd, struct filename *name, int flags) > >> { > >> int error; > >> - struct dentry *dentry; > >> + struct dentry *dentry, *parent; > >> struct path path; > >> struct qstr last; > >> int type; > >> struct inode *inode = NULL; > >> struct inode *delegated_inode = NULL; > >> unsigned int lookup_flags = 0; > >> -retry: > >> - error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type); > >> - if (error) > >> - goto exit1; > >> + bool empty_path = (flags & AT_EMPTY_PATH); > >> > >> - error = -EISDIR; > >> - if (type != LAST_NORM) > >> - goto exit2; > >> +retry: > >> + if (empty_path) { > >> + error = filename_lookup(dfd, name, 0, &path, NULL); > >> + if (error) > >> + goto exit1; > >> + parent = path.dentry->d_parent; > >> + dentry = path.dentry; > >> + } else { > >> + error = filename_parentat(dfd, name, lookup_flags, &path, &last, &type); > >> + if (error) > >> + goto exit1; > >> + error = -EISDIR; > >> + if (type != LAST_NORM) > >> + goto exit2; > >> + parent = path.dentry; > >> + } > >> > >> error = mnt_want_write(path.mnt); > >> if (error) > >> goto exit2; > >> retry_deleg: > >> - inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT); > >> - dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags); > >> + inode_lock_nested(parent->d_inode, I_MUTEX_PARENT); > >> + if (!empty_path) > >> + dentry = lookup_one_qstr_excl(&last, parent, lookup_flags); > > > > For starters, your 'parent' might have been freed under you, just as you'd > > been trying to lock its inode. Or it could have become negative just as you'd > > been fetching its ->d_inode, while we are at it. > > > > Races aside, you are changing permissions required for removing files. For > > unlink() you need to be able to get to the parent directory; if it's e.g. > > outside of your namespace, you can't do anything to it. If file had been > > opened there by somebody who could reach it and passed to you (via SCM_RIGHTS, > > for example) you currently can't remove the sucker. With this change that > > is no longer true. > > > > The same goes for the situation when file is bound into your namespace (or > > chroot jail, for that matter). path.dentry might very well be equal to > > root of path.mnt; path.dentry->d_parent might be in part of tree that is > > no longer visible *anywhere*. rmdir() should not be able to do anything > > with it... > > > > IMO it's fundamentally broken; not just implementation, but the concept > > itself. > > > > NAKed-by: Al Viro <viro@xxxxxxxxxxxxxxxxxx> Al, thank you for this information. It does shine a little light on where dragons may be hiding. I was wondering if you could comment on a related issue. linkat does allow specifing AT_EMPTY_PATH. However, it requires CAP_DAC_READ_SEARCH. I saw that a patch went into the kernel to remove this restriction, but was shortly thereafter reverted with a comment to the effect of "We'll have to think about this a little more". Then, radio silence. Other than requiring /proc be mounted to bypass, what problem does this restriction solve? Also related, the thing I'm even more interested in is the ability to create an O_TMPFILE, populate it, set permissions, etc, and then make it appear in a directory. The problem is I almost always don't want it to just appear, but rather atomically replace an older version of the file. dfd = openat(x, "y", O_RDONLY | O_CLOEXEC | O_DIRECTORY, 0) fd = openat(dfd, ".", O_RDWR | O_CLOEXEC | O_TMPFILE, 0600) do stuff with fd fsync(fd) linkat(fd, "", dfd, "z", AT_EMPTY_PATH | AT_REPLACE?) close(fd) fsync(dfd) close(dfd) The AT_REPLACE flag has been suggested before to work around the EEXIST behavior. Alternatively, renameat2 could accept AT_EMPTY_PATH for olddirfd/oldpath, but fixing up linkat seems a little cleaner. Without this, it hardly seems worthwhile to use O_TMPFILE at all, and instead just go through the hassle of creating the file with a random name (plus exposing that file and having to possibly rm it in case of an error). I haven't been able to find any explanation for the AT_REPLACE idea not gaining traction. Is there some security reason for this? Thanks > Thank you for your review, which made me glad I sent out the patch early > as an RFC. I (think I) understand the issues you pointed out and, > although some of them could be fixed (the races), I guess there's no point > pursuing this any further, since you consider the concept itself to be > broken. Again, thank you for your time. > > Cheers, > -- > Luís