Hi Christian, On Sat, Feb 24, 2024 at 08:15:53PM +0100, Christian Brauner wrote: > On Sat, Feb 24, 2024 at 10:48:11AM -0800, Linus Torvalds wrote: > > On Fri, 23 Feb 2024 at 21:52, Christian Brauner <brauner@xxxxxxxxxx> wrote: > > > > > > This is selinux. So I think this is a misunderstanding. This isn't > > > something we can fix in the kernel. > > > > Sure it is. SELinux just goes by what the kernel tells it anyway. > > > > Presumably this is purely about the fact that the inode in question > > *used* to be that magical 'anon_inode_inode' that is shared when you > > don't want or need a separate inode allocation. I assume it doesn't > > even look at that, it just looks at the 'anon_inode_fs_type' thing (or > > maybe at the anon_inode_mnt->mnt_sb that is created by kern_mount in > > anon_inode_init?) > > > > IOW, isn't the *only* difference that selinux can actually see just > > the inode allocation? It used to be that > > > > inode = anon_inode_getfile(); > > > > now it is > > > > inode = new_inode_pseudo(pidfdfs_sb); > > > > and instead of sharing one single inode (like anon_inode_getfile() > > does unless you ask for separate inodes), it now shares the dentry > > instead (for the same pid). > > > > Would selinux be happy if the inode allocation just used the > > anon_inode superblock instead of pidfdfs_sb? > > No, unfortunately not. The core issue is that anon_inode_getfile() isn't > subject to any LSM hooks which is what pidfds used. But dentry_open() is > via security_file_open(). LSMs wanted to have a say in pidfd mediation > which is now possible. So the switch to dentry_open() is what is causing > the issue. > > But here's a straightforward fix appended. We let pidfs.c use that fix > as and then we introduce a new LSM hook for pidfds that allows mediation > of pidfds and selinux can implement it when they're ready. This is > regression free and future proof. I actually tested this already today. > > How does that sounds? I see a patch similar to this change in your vfs.pidfs branch as commit 47a1fbce74c3 ("pidfs: convert to path_from_stashed() helper"), which also appears to be in next-20240227. However, I still seem to be having similar issues (although I cannot reproduce them every single boot like I used to). I do see some SELinux denials for pidfs, although it seems like it is only write that is being denied, rather than open, read, and write? # uname -r 6.8.0-rc6-next-20240227 # systemctl --failed --no-legend --plain fwupd-refresh.service loaded failed failed Refresh fwupd metadata and update motd mcelog.service loaded failed failed Machine Check Exception Logging Daemon polkit.service loaded failed failed Authorization Manager # journalctl -b 0 -g denied -t audit | head -3 Feb 27 10:49:20 qemu audit[1]: AVC avc: denied { write } for pid=1 comm="systemd" path="pidfd:[1547]" dev="pidfs" ino=1547 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=0 Feb 27 10:49:21 qemu audit[1]: AVC avc: denied { write } for pid=1 comm="systemd" path="pidfd:[1564]" dev="pidfs" ino=1564 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=0 Feb 27 10:50:50 qemu audit[1]: AVC avc: denied { write } for pid=1 comm="systemd" path="pidfd:[1547]" dev="pidfs" ino=1547 scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=0 Cheers, Nathan