Re: [PATCH 8/8] NFSD: Instantiate a struct file when creating a regular NFSv4 file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 13, 2022 at 10:20:18PM +0000, Al Viro wrote:

> Yuck.  dget_parent() is not entirely without valid uses, but this isn't
> one.  It's for the cases when parent is *not* stable and you need to grab
> what had been the parent at some point (even though it might not be the
> parent anymore by the time dget_parent() returns).  Here you seriously
> depend upon it remaining the parent of that sucker all the way through -
> otherwise vfs_create() would break.  And you really, really depend upon
> its survival - the caller is holding it locked, so they would better
> have it pinned.

As an aside, the reason why vfs_create() takes inode of parent directory
and dentry of child is basically that it's easier to describe the locking
rules that way: vfs_create(..., dir, child, ...) must be called with
1) dir being held by caller (exclusive) and
2) child->d_parent->d_inode == dir, which is stabilized by (1)

inode of parent directory is a redundant argument - it can be easily
derived from the child dentry, for all that family.  The only real
objection against dropping it from vfs_create() and friends is that
having rules described as "inode of parent dentry of child must be held
exclusive by the caller" invites breakage along the lines of

	parent = dget_parent(child);
	inode_lock(d_inode(parent));
	vfs_create(..., child, ...);	// WRONG
	inode_unlock(d_inode(parent));
	dput(parent);

which *seems* to match the rules, but actually breaks them badly -
'parent' in the snippet above might be no longer related to child by the
time dget_parent() returns it, so we end up calling vfs_create() with
wrong directory locked, child->d_parent being completely unstable, etc.
Note that the difference from your code (which is correct, if redundant) is
rather subtle.

If you have any suggestions how to describe these locking rules without
an explicit inode-of-parent argument, I would really like to hear those.
The best I'd been able to come up with had been "there's an inode
locked exclusive by the caller that had been observed to be equal to
child->d_parent->d_inode at some point after it had been locked", which
is both cumbersome and confusing...



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux