On Fri, 2024-11-29 at 14:02 +0100, Christian Brauner wrote:
> Hey,
>
> This reworks the inode number allocation for pidfs in order to support
> file handles properly.
>
> Recently we received a patchset that aims to enable file handle encoding
> and decoding via name_to_handle_at(2) and open_by_handle_at(2).
>
> A crucial step in the patch series is how to go from inode number to
> struct pid without leaking information into unprivileged contexts. The
> issue is that in order to find a struct pid the pid number in the
> initial pid namespace must be encoded into the file handle via
> name_to_handle_at(2). This can be used by containers using a separate
> pid namespace to learn what the pid number of a given process in the
> initial pid namespace is. While this is a weak information leak it could
> be used in various exploits and in general is an ugly wart in the
> design.
>
> To solve this problem a new way is needed to look up a struct pid based
> on the inode number allocated for that struct pid. The other part is to
> remove the custom inode number allocation on 32bit systems that is also
> an ugly wart that should go away.
>
> So, a new scheme is used that I was discussing with Tejun some time
> back. A cyclic ida is used for the lower 32 bits and the high 32 bits
> are used for the generation number. This gives a 64 bit inode number
> that is unique on both 32 bit and 64 bit. The lower 32 bit number is
> recycled slowly and can be used to look up struct pids.
>
> Thanks!
> Christian
>
> ---
> Changes in v2:
> - Remove __maybe_unused pidfd_ino_get_pid() function that was only there
>   for initial illustration purposes.
> - Link to v1: https://lore.kernel.org/r/20241128-work-pidfs-v1-0-80f267639d98@xxxxxxxxxx
>
> ---
> Christian Brauner (3):
>       pidfs: rework inode number allocation
>       pidfs: remove 32bit inode number handling
>       pidfs: support FS_IOC_GETVERSION
>
>  fs/pidfs.c            | 118 ++++++++++++++++++++++++++++++++------------------
>  include/linux/pidfs.h |   2 +
>  kernel/pid.c          |  14 +++---
>  3 files changed, 86 insertions(+), 48 deletions(-)
> ---
> base-commit: b86545e02e8c22fb89218f29d381fa8e8b91d815
> change-id: 20241128-work-pidfs-2bd42c7ea772
>

This seems like a good stopgap fix until we can sort out how to get to
64-bit inode numbers internally everywhere.

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
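
For readers following along, the numbering scheme described in the quoted cover letter (a slowly recycled lower 32 bits plus a generation number in the upper 32 bits) can be modelled roughly as below. This is only a userspace sketch, not the code from the series: the names struct ino_alloc, ino_alloc_next(), ino_low() and ino_gen() are hypothetical, and a plain wrapping counter stands in for the kernel's cyclic ida, on the assumption that the generation is bumped whenever the lower half wraps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of a 64-bit inode number built from
 * two 32-bit halves. Not the kernel's pidfs implementation.
 */
struct ino_alloc {
	uint32_t next;       /* next lower-32-bit value to hand out */
	uint32_t generation; /* upper 32 bits; assumed to bump on wrap */
};

/* Return the next 64-bit number: generation in the high half, counter in the low half. */
static uint64_t ino_alloc_next(struct ino_alloc *a)
{
	assert(a != NULL);

	uint32_t low = a->next++; /* unsigned wrap-around is well defined */
	uint64_t ino = ((uint64_t)a->generation << 32) | low;

	/* The counter wrapped back to 0: start a new generation. */
	if (a->next == 0)
		a->generation++;

	return ino;
}

/* Lower 32 bits: the part that could serve as a lookup key. */
static uint32_t ino_low(uint64_t ino)
{
	return (uint32_t)ino;
}

/* Upper 32 bits: the generation number. */
static uint32_t ino_gen(uint64_t ino)
{
	return (uint32_t)(ino >> 32);
}
```

In this model two numbers that share the same lower 32 bits after a wrap still differ in the upper half, so the full 64-bit value stays unique while the lower half alone remains usable as a lookup key.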