Re: [PATCH 2/2] pidfd: add pidfdfs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Feb 18, 2024 at 12:15:02PM +0100, Christian Brauner wrote:
> On Sat, Feb 17, 2024 at 09:30:19AM -0800, Linus Torvalds wrote:
> > On Sat, 17 Feb 2024 at 06:00, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> > >
> > > But I have a really stupid (I know nothing about vfs) question, why do we
> > > need pidfdfs_ino and pid->ino ? Can you explain why pidfdfs_alloc_file()
> > > can't simply use, say, iget_locked(pidfdfs_sb, (unsigned long)pid) ?
> > >
> > > IIUC, if this pid is freed and then another "struct pid" has the same address
> > > we can rely on __wait_on_freeing_inode() ?
> > 
> > Heh. Maybe it would work, but we really don't want to expose core
> > kernel pointers to user space as the inode number.
> 
> And then also the property that the inode number is unique for the
> system lifetime is extremely useful for userspace and I would like to
> retain that property.
> 
> > 
> > So then we'd have to add extra hackery to that (ie we'd have to
> > intercept stat calls, and we'd have to have something else for
> > ->d_dname() etc..).
> > 
> > Those are all things that the VFS does support, but ...
> > 
> > So I do prefer Christian's new approach, although some of it ends up
> > being a bit unclear.
> > 
> > Christian, can you explain why this:
> > 
> >         spin_lock(&alias->d_lock);
> >         dget_dlock(alias);
> >         spin_unlock(&alias->d_lock);
> > 
> > instead of just 'dget()'?
> 
> No reason other than I forgot to switch to dget().
> 
> > 
> > Also, while I found the old __ns_get_path() to be fairly disgusting, I
> > actually think it's simpler and clearer than playing around with the
> > dentry alias list. So my expectation on code sharing was that you'd
> 
> It's overall probably also cheaper, I think.
> 
> > basically lift the old __ns_get_path(), make *that* the helper, and
> > just pass it an argument that is the pointer to the filesystem
> > "stashed" entry...
> > 
> > And yes, using "atomic_long_t" for stashed is a crime against
> > humanity. It's also entirely pointless. There are no actual atomic
> > operations that the code wants except for reading and writing (aka
> > READ_ONCE() and WRITE_ONCE()) and cmpxchg (aka just cmpxchg()). Using
> > "atomic_long_t" buys the code nothing, and only makes things more
> > complicated and requires crazy casts.
> 
> Yup, I had that as a draft and that introduced struct ino_stash which
> contained a dentry pointer and the inode number using cmpxchg(). But I
> decided against this because ns_common.h would require to have access to
> ino_stash definition so we wouldn't just able to hide it in internal.h
> where it should belong.

Right, I remember. The annoying thing will be how to cleanly handle this
without having to pass too many parameters because we need d_fsdata, the
vfsmount, and the inode->i_fop. So let me see if I can get this to
something that doesn't look too ugly.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux