On Fri, Jan 17, 2020 at 09:28:55AM -0800, Omar Sandoval wrote:
> On Fri, Jan 17, 2020 at 04:59:04PM +0000, Al Viro wrote:
> > On Fri, Jan 17, 2020 at 08:36:16AM -0800, Omar Sandoval wrote:
> >
> > > The semantics I implemented in my series were basically "linkat with
> > > AT_REPLACE replaces the target iff rename would replace the target".
> > > Therefore, symlinks are replaced, not followed, and mountpoints get
> > > EXDEV. In my opinion that's both sane and unsurprising.
> >
> > Umm... EXDEV in rename() comes when _parents_ are on different mounts.
> > rename() over a mountpoint is EBUSY if it has mounts in caller's
> > namespace, but it succeeds (and detaches all mounts on the victim
> > in any namespaces) otherwise.
> >
> > When are you returning EXDEV?
>
> EXDEV was a thinko, the patch does what rename does:
>
>
> +	if (is_local_mountpoint(new_dentry)) {
> +		error = -EBUSY;
> +		goto out;
> +	}
>
> ...
>
> +	if (target) {
> +		dont_mount(new_dentry);
> +		detach_mounts(new_dentry);
> +	}
>
> Anyways, my point is that the rename semantics cover 90% of AT_REPLACE.
> Before I resend the patches, I'll write up the documentation and we can
> see what other corner cases I missed.

OK... rename() has a major difference from linkat(), though, in not
following links (or allowing fd + empty path).

link() is deeply asymmetric in treatment of pathnames - the first argument
is "pathname describes a filesystem object" and the second - "pathname
describes an entry in a directory" (the link to be). rename() is not - both
arguments are pathnames-as-link-specifiers. And that affects what is and
what is not allowed there, so that'll need a careful look into.

FWIW, currently linkat() semantics can be described simply enough:
	* oldfd/oldname specifies an fs object; symlink traversal is optional.
	* newfd/newname specifies an entry in some directory. Directory
must be on the same mount as the object specified by oldname. Entry must
be a normal component (no empty paths allowed, no . or .. either).
Trailing slashes are not allowed. There must be no entry with that name
(which automatically implies that trailing symlinks are not to be followed
and there can't be anything mounted on it).
	* the object specified by oldname must be a non-directory.
	* if the object is not a never-linked-yet anonymous, it must
still have some links.
	* caller must have permission to create links in the affected
directory.
	* append-only and immutables are not allowed (rationale: they
can't be unlinked)
	* filesystem is allowed to fail for any reasons, as with any
operation; a linkat()-specific one is having the link count overflow,
but any generic error is possible (out of memory, IO error,
EPERM-because-I-feel-like-that, etc.)

All checks related to the object in question are atomic wrt operations that
add or remove links to that object. Checks on the parent are atomic wrt
operations modifying the parent. Neither group is atomic wrt operations
modifying the _old_ parent (if there's any, in the first place).
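
A few of the rules above can be observed from userspace. What follows is
only a minimal sketch, assuming Linux and a writable /tmp; try_linkat(),
nlink_of() and make_scratch_dir() are hypothetical helper names made up
for illustration (they are not kernel or libc functions), and the errno
values mentioned in the comments are what Linux is expected to return,
not something the list above promises.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: link oldname to newname, both relative to dirfd.
 * Returns 0 on success, otherwise the errno value set by linkat().
 */
static int try_linkat(int dirfd, const char *oldname, const char *newname,
		      int flags)
{
	assert(oldname && newname);
	if (linkat(dirfd, oldname, dirfd, newname, flags) == 0)
		return 0;
	return errno;
}

/*
 * Hypothetical helper: link count of the entry itself (a trailing
 * symlink is not followed), or -1 on error.
 */
static long nlink_of(int dirfd, const char *name)
{
	struct stat st;

	if (fstatat(dirfd, name, &st, AT_SYMLINK_NOFOLLOW) != 0)
		return -1;
	return (long)st.st_nlink;
}

/*
 * Hypothetical helper: create a scratch directory under /tmp holding
 *   file    a regular file
 *   subdir  a directory
 *   sym     a symlink pointing to "file"
 * Returns an O_DIRECTORY fd for it, or -1 on error.  The directory is
 * left behind; cleanup is omitted to keep the sketch short.
 */
static int make_scratch_dir(void)
{
	char tmpl[] = "/tmp/linkat-demo-XXXXXX";
	int dirfd, fd;

	if (!mkdtemp(tmpl))
		return -1;
	dirfd = open(tmpl, O_RDONLY | O_DIRECTORY);
	if (dirfd < 0)
		return -1;
	fd = openat(dirfd, "file", O_CREAT | O_EXCL | O_WRONLY, 0600);
	if (fd < 0)
		goto fail;
	close(fd);
	if (mkdirat(dirfd, "subdir", 0700) != 0)
		goto fail;
	if (symlinkat("file", dirfd, "sym") != 0)
		goto fail;
	return dirfd;
fail:
	close(dirfd);
	return -1;
}
```

With d = make_scratch_dir(), try_linkat(d, "file", "link1", 0) should
succeed the first time and fail with EEXIST the second time ("there must
be no entry with that name"); using "sym" as the new name should also give
EEXIST, since a trailing symlink in the new name is not followed. Using
"subdir" as the old name should give EPERM ("must be a non-directory").
Passing AT_SYMLINK_FOLLOW with "sym" as the old name links the regular
file rather than the symlink, which shows up as a higher nlink_of(d, "file").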