Re: [PATCH RFC 0/6] proc: restrict overmounting of ephemeral entities

On Tue, Aug 06, 2024 at 06:02:26PM +0200, Christian Brauner wrote:
> (Preface because I've been panic-approached by people at conferences
>  when we discussed this before: overmounting any global procfs files
>  such as /proc/status remains unaffected and is an existing and
>  supported use-case.)
> 
> It is currently possible to mount on top of various ephemeral entities
> in procfs. This specifically includes magic links. To recap, magic links
> are links of the form /proc/<pid>/fd/<nr>. They serve as references to
> a target file and during path lookup they cause a jump to the target
> path. Such magic links disappear if the corresponding file descriptor is
> closed.
> 
> Currently it is possible to overmount such magic links:
> 
> int fd = open("/mnt/foo", O_RDONLY);
> sprintf(path, "/proc/%d/fd/%d", getpid(), fd);
> int fd2 = openat(AT_FDCWD, path, O_PATH | O_NOFOLLOW);
> mount("/mnt/bar", path, "", MS_BIND, 0);
> 
> Arguably, this is nonsensical and is mostly interesting for an attacker
> that wants to somehow trick a process into e.g., reopening something
> that they didn't intend to reopen or to hide a malicious file
> descriptor.
> 
> But also it risks leaking mounts for long-running processes. When
> overmounting a magic link like above, the mount will not be detached
> when the file descriptor is closed. Only the target mountpoint will
> disappear. Which has the consequence of making it impossible to unmount
> that mount afterwards. So the mount will stick around until the process
> exits and the /proc/<pid>/ directory is cleaned up during
> proc_flush_pid() when the dentries are pruned and invalidated.
> 
> That in turn means it's possible for a program to accidentally leak
> mounts and it's also possible to make a task leak mounts without its
> knowledge if the attacker just keeps overmounting things under
> /proc/<pid>/fd/<nr>.
> 
> I think it's wrong to try and fix this by us starting to play games with
> close() or somewhere else to undo these mounts when the file descriptor
> is closed. The fact that we allow overmounting of such magic links is
> simply a bug and one that we need to fix.
> 
> Similar things can be said about entries under fdinfo/ and map_files/ so
> those are restricted as well.
> 
> I have a further, more aggressive patch that gets out the big hammer and
> makes everything under /proc/<pid>/*, as well as immediate symlinks such
> as /proc/self, /proc/thread-self, /proc/mounts, /proc/net that point
> into /proc/<pid>/ not overmountable. Imho, all of this should be blocked
> if we can get away with it. It's only useful to hide exploits such as in [1].
> 
> And again, overmounting of any global procfs files remains unaffected
> and is an existing and supported use-case.
> 
> Link: https://righteousit.com/2024/07/24/hiding-linux-processes-with-bind-mounts [1]
> 
> // Note that repro uses the traditional way of just mounting over
> // /proc/<pid>/fd/<nr>. This could also all be achieved just based on
> // file descriptors using move_mount(). So /proc/<pid>/fd/<nr> isn't the
> // only entry vector here. It's also possible to e.g., mount directly
> // onto /proc/<pid>/map_files/* without going over /proc/<pid>/fd/<nr>.
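> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <limits.h>
> #include <stdio.h>
> #include <sys/mount.h>
> #include <unistd.h>
> 
> int main(int argc, char *argv[])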
> {
>         char path[PATH_MAX];
> 
>         creat("/mnt/foo", 0777);
>         creat("/mnt/bar", 0777);
> 
>         /*
>          * For illustration use a bunch of file descriptors in the upper
>          * range that are unused.
>          */
>         for (int i = 10000; i >= 256; i--) {
>                 printf("I'm: /proc/%d/\n", getpid());
> 
>                 int fd2 = open("/mnt/foo", O_RDONLY);
>                 if (fd2 < 0) {
>                         printf("%m - Failed to open\n");
>                         _exit(1);
>                 }
> 
>                 int newfd = dup2(fd2, i);
>                 if (newfd < 0) {
>                         printf("%m - Failed to dup\n");
>                         _exit(1);
>                 }
>                 close(fd2);
> 
>                 sprintf(path, "/proc/%d/fd/%d", getpid(), newfd);
>                 int fd = openat(AT_FDCWD, path, O_PATH | O_NOFOLLOW);
>                 if (fd < 0) {
>                         printf("%m - Failed to open\n");
>                         _exit(3);
>                 }
> 
>                 sprintf(path, "/proc/%d/fd/%d", getpid(), fd);
>                 printf("Mounting on top of %s\n", path);
>                 if (mount("/mnt/bar", path, "", MS_BIND, 0)) {
>                         printf("%m - Failed to mount\n");
>                         _exit(4);
>                 }
> 
>                 close(newfd);
>                 /* fd2 was already closed above; close the O_PATH fd we mounted on. */
>                 close(fd);
>         }
> 
>         /*
>          * Give some time to look at things. The mounts now linger until
>          * the process exits.
>          */
>         sleep(10000);
>         _exit(0);
> }
> 
> Co-developed-by: Aleksa Sarai <cyphar@xxxxxxxxxx>
> Signed-off-by: Aleksa Sarai <cyphar@xxxxxxxxxx>
> Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>

I'm always down to restrict /proc, you can add

Reviewed-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

Thanks,

Josef
