On 6/10/19 11:00 AM, Stephen Smalley wrote:
> On 6/10/19 10:37 AM, Daniel Walsh wrote:
>> On 6/10/19 10:08 AM, Stephen Smalley wrote:
>>> On 6/8/19 10:08 AM, Daniel Walsh wrote:
>>>> On 6/7/19 5:26 PM, Stephen Smalley wrote:
>>>>> On 6/7/19 5:06 PM, Daniel Walsh wrote:
>>>>>> On 6/7/19 12:44 PM, Stephen Smalley wrote:
>>>>>>> On 6/7/19 11:42 AM, Daniel Walsh wrote:
>>>>>>>> We have periodic vulnerabilities around bad container images
>>>>>>>> using symbolic link attacks against the host.
>>>>>>>>
>>>>>>>> One came out last week involving `podman cp`, which copies
>>>>>>>> content from the host into the container. The issue was that
>>>>>>>> if the container was running, it could trick the process
>>>>>>>> copying content into it into following a symbolic link to a
>>>>>>>> location outside of the container image.
>>>>>>>>
>>>>>>>> The question came up: is there a way to use SELinux to prevent
>>>>>>>> this? Sadly, the answer right now is no, because we have no way
>>>>>>>> to know what label the process attempting to update the
>>>>>>>> container file system is running with. Usually it will be
>>>>>>>> running as unconfined_t.
>>>>>>>>
>>>>>>>> One idea would be to add a rule to policy that restricts the
>>>>>>>> following of symbolic links to only those specified in policy.
>>>>>>>>
>>>>>>>> Something like
>>>>>>>>
>>>>>>>> SPECIALRESTRICTED TYPE container_file_t
>>>>>>>>
>>>>>>>> allow container_file_t container_file_t:symlink follow;
>>>>>>>>
>>>>>>>> Then if a process attempted to copy content onto a symbolic link
>>>>>>>> from container_file_t to a non-container_file_t type, the kernel
>>>>>>>> would deny access.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>
>>>>>>> SELinux would prevent it if you didn't allow unconfined_t (or other
>>>>>>> privileged domains) to follow untrustworthy symlinks (e.g.
>>>>>>> don't allow unconfined_t container_file_t:lnk_file read; in the
>>>>>>> first place). That's the right way to prevent it.
>>>>>>>
>>>>>>> Trying to apply a check between a symlink and its target as you
>>>>>>> suggest is problematic; we don't generally have them both at the
>>>>>>> same point. If we are allowed to follow the symlink, we read its
>>>>>>> contents and perform a path walk on that, and that could be a
>>>>>>> multi-component pathname lookup that itself spans further
>>>>>>> symlinks, mount points, etc. I think that would be challenging to
>>>>>>> support in the kernel, subject to races, and certainly would
>>>>>>> require changes outside of just SELinux.
>>>>>>>
>>>>>>> If you truly cannot impose such restrictions on unconfined_t,
>>>>>>> then maybe podman should run in its own domain.
>>>>>>>
>>>>>> This is not an issue with just podman. Podman can mount the image
>>>>>> and the tools can just read/write content into the mountpoint.
>>>>>>
>>>>>> I thought I recalled an LSM that prevented symlink attacks where
>>>>>> users would link a file in their homedir to /etc/shadow and then
>>>>>> attempt to get the admin to modify the file in the user's homedir?
>>>>>>
>>>>>> I was thinking that if that existed we could build more controls
>>>>>> on it based on labels rather than just UIDs matching.
>>>>>
>>>>> Not sure if you are thinking of symlink attacks or hard link attacks.
>>>>> SELinux supports preventing the former by restricting the ability to
>>>>> follow symlinks based on lnk_file read permission, so you can prevent
>>>>> trusted processes from following untrustworthy symlinks. SELinux
>>>>> supports preventing the latter by restricting the ability to create
>>>>> hard links to unauthorized files.
>>>>> But you need to write your policies in a manner that leverages that
>>>>> support, and a fully unconfined domain isn't going to be protected
>>>>> via SELinux by definition; ideally you'd be phasing out unconfined
>>>>> altogether like Android did. Modern kernels also have the
>>>>> /proc/sys/fs/protected_hardlinks and
>>>>> /proc/sys/fs/protected_symlinks settings, which restrict based on
>>>>> UID, but the symlink checks aren't based on the target of the
>>>>> symlink either.
>>>>
>>>> Android does not have an admin, so it is a lot easier for them. But
>>>> I am not going to get into that now. I obviously understand how
>>>> SELinux works. But perhaps I am looking for something different.
>>>>
>>>> This link describes something pretty close to what I would want, but
>>>> extended for labels rather than just UIDs.
>>>>
>>>> https://sysctl-explorer.net/fs/protected_symlinks/
>>>>
>>>>
>>>>> A long-standing class of security issues is the symlink-based
>>>>> time-of-check-time-of-use race, most commonly seen in world-writable
>>>>> directories like /tmp. The common method of exploitation of this flaw
>>>>> is to cross privilege boundaries when following a given symlink (i.e.
>>>>> a **PRIVILEGED** process follows a symlink belonging **PROVIDED BY
>>>>> OTHERS**). For a likely incomplete list of hundreds of examples
>>>>> across the years, please see:
>>>>> http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
>>>>>
>>>>> When set to “0”, symlink following behavior is unrestricted.
>>>>>
>>>>> When set to “1” symlinks are permitted to be followed only when
>>>>> outside a sticky world-writable directory **WE COULD POTENTIALLY SET
>>>>> THIS OR SOME OTHER FLAG**, or when the **LABEL** of the symlink and
>>>>> follower match, or when the directory **LABEL** matches the symlink’s
>>>>> **LABEL**.
>>>>>
>>>>> This protection is based on the restrictions in Openwall and
>>>>> grsecurity.
>>>>>
>>>
>>> That's the /proc/sys/fs/protected_symlinks feature I mentioned in my
>>> email above. It isn't based on the target of the symlink; it is only
>>> based on the attributes of the follower process (e.g. root), the
>>> attributes of the parent directory containing the symlink (e.g. /tmp),
>>> and the attributes of the symlink file (e.g. /tmp/foo -> /etc/shadow).
>>> At no point is it checking anything about the target of the symlink,
>>> e.g. /etc/shadow. If dwalsh creates a symlink under /tmp (ln -s
>>> /etc/shadow /tmp/foo) and root tries to follow /tmp/foo, then that
>>> will fail because 1) the process fsuid (root) != the /tmp/foo symlink
>>> owner (dwalsh), and 2) /tmp is a sticky and world-writable directory,
>>> and 3) the /tmp directory owner (root) != the /tmp/foo symlink owner
>>> (dwalsh). Note that conditions (2) and (3) render the check useless
>>> for your use case, since you want to prevent following any symlinks
>>> writable by container processes in any directory within the container
>>> filesystem, so the directory need not be world-writable/sticky and the
>>> parent directory UID/label might be identical to the symlink UID/label.
>> We are mounting the file system (most of the time), so we could add a
>> flag to indicate that this is a protected file system.
>
> You are effectively already doing that by mounting with a context
> mount that assigns container_file_t or whatever type to the
> filesystem. You don't need something new there.

Well, yes with the Overlay Driver. Not with the VFS Driver, and maybe not
with fuse-overlay.

>
>>>
>>>
>>> The existing SELinux lnk_file read permission check enables you to
>>> apply stronger label-based controls to all symlinks within the
>>> container filesystem, not just ones in /tmp-like directories.
>>> Don't allow unconfined_t or any other privileged domain read
>>> permission to container_file_t:lnk_file (or preferably to any file
>>> type for which :lnk_file create is allowed to container process
>>> domains), and you'll never have to worry about them following a
>>> symlink writable by a container process. This of course assumes that
>>> the container filesystem is always labeled with a type that is
>>> untrusted, whether via mount contexts or actual labels.
>>
>> But we want to allow domains to follow container_file_t links that point
>> to container_file_t objects, just not follow them if they point to
>> other types. This means there is no protection that I could write for a
>> domain like unconfined_t that says: only follow links when the types
>> match, or when the types have allow rules.
>
> You really don't want programs on the host OS that are acting on a
> container filesystem to ever follow any symlinks within it. It just
> isn't a good idea; even if you limit it to intra-container symlinks,
> then an attacker could use the host process to overwrite some file
> within the container that wasn't directly writable by him.
>
> In any event, I don't know how one would implement a check between the
> symlink and its target; you'd have to save the symlink information
> until you reach the final target and then call a hook with both of
> them. And what if there are multiple symlinks in that path? Symlinks
> to symlinks?
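
For reference, the protected_symlinks decision described in the quoted
text above (follower fsuid, parent directory attributes, symlink owner)
can be sketched roughly as below. This is only a simplified model of the
three conditions as listed in the thread, not the kernel's code; the
`Node` class, the `may_follow_symlink` function, and the numeric UID
used for dwalsh are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for the inode attributes the check looks at."""
    uid: int
    sticky: bool = False
    world_writable: bool = False


def may_follow_symlink(fsuid, link, parent, protected_symlinks=1):
    """Model of the UID-based check; note the link target is never an input."""
    if not protected_symlinks:
        return True   # setting is "0": following is unrestricted
    if fsuid == link.uid:
        return True   # follower owns the symlink
    if not (parent.sticky and parent.world_writable):
        return True   # parent is not a /tmp-like directory
    if parent.uid == link.uid:
        return True   # directory owner created the symlink
    return False      # all three conditions from the thread hold: deny


ROOT, DWALSH = 0, 1000  # 1000 is an assumed UID for illustration

# ln -s /etc/shadow /tmp/foo as dwalsh, then root follows /tmp/foo
tmp_dir = Node(uid=ROOT, sticky=True, world_writable=True)
tmp_foo = Node(uid=DWALSH)
denied_in_tmp = not may_follow_symlink(ROOT, tmp_foo, tmp_dir)

# A symlink inside a container rootfs directory that is neither sticky
# nor world-writable: the check does not apply, so root may follow it.
rootfs_dir = Node(uid=ROOT)
rootfs_link = Node(uid=ROOT)
allowed_in_container = may_follow_symlink(ROOT, rootfs_link, rootfs_dir)
```

In this model the /tmp case is denied while the container rootfs case is
allowed, which matches the point made above that conditions (2) and (3)
make the existing check a poor fit for container filesystems.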