Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:

> On Jul 15, 2015 3:34 PM, "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> Seth Forshee <seth.forshee@xxxxxxxxxxxxx> writes:
>>
>> > On Wed, Jul 15, 2015 at 04:06:35PM -0500, Eric W. Biederman wrote:
>> >> Casey Schaufler <casey@xxxxxxxxxxxxxxxx> writes:
>> >>
>> >> > On 7/15/2015 12:46 PM, Seth Forshee wrote:
>> >> >> These are the first in a larger set of patches that I've been working on
>> >> >> (with help from Eric Biederman) to support mounting ext4 and fuse
>> >> >> filesystems from within user namespaces. I've pushed the full series to:
>> >> >>
>> >> >>   git://kernel.ubuntu.com/sforshee/linux.git userns-mounts
>> >> >>
>> >> >> Taking the series as a whole, the strategy is to handle as much of the
>> >> >> heavy lifting as possible in the vfs so the filesystems don't have to
>> >> >> handle weird edge cases. If you look at the full series you'll find that
>> >> >> the changes in ext4 to support user namespace mounts turn out to be
>> >> >> fairly minimal (fuse is a bit more complicated though as it must deal
>> >> >> with translating ids for a userspace process which is running in pid and
>> >> >> user namespaces).
>> >> >>
>> >> >> The patches I'm sending today lay some of the groundwork in the vfs and
>> >> >> related code. They fall into two broad groups:
>> >> >>
>> >> >> 1. Patches 1-2 add s_user_ns and simplify MNT_NODEV handling. These are
>> >> >>    pretty straightforward, and Eric has expressed interest in merging
>> >> >>    these patches soon. Note that patch 2 won't apply cleanly without
>> >> >>    Eric's noexec patches for proc and sys [1].
>> >> >>
>> >> >> 2. Patches 2-7 tighten down security for mounts with s_user_ns !=
>> >> >>    &init_user_ns. This includes updates to how file caps and suid are
>> >> >>    handled and LSM updates to ignore security labels on superblocks
>> >> >>    from non-init namespaces.
>> >> >>
>> >> >>    The LSM changes in particular may not be optimal, as I don't have a
>> >> >>    lot of familiarity with this code, so I'd be especially appreciative
>> >> >>    of review of these changes and suggestions on how to improve them.
>> >> >
>> >> > Lukasz Pawelczyk <l.pawelczyk@xxxxxxxxxxx> proposed
>> >> > LSM support in user namespaces ([RFC] lsm: namespace hooks)
>> >> > that makes a whole lot more sense than just turning off
>> >> > the option of using labels on files. Gutting the ability
>> >> > to use MAC in a namespace is a step down the road of
>> >> > making MAC and namespaces incompatible.
>> >>
>> >> This is not "turning off the option to use labels on files".
>> >>
>> >> This is supporting mounting filesystems like ext4 by unprivileged users
>> >> and not trusting the labels they set in the same way as we trust labels
>> >> on filesystems mounted by privileged users.
>> >>
>> >> The first step needs to be not trusting those labels and treating such
>> >> filesystems as filesystems without label support. I hope that is what
>> >> Seth has implemented.
>> >>
>> >> In the long run we can do more interesting things with such filesystems
>> >> once the appropriate LSM policy is in place.
>> >
>> > Yes, this exactly. Right now it looks to me like the only safe thing to
>> > do with mounts from unprivileged users is to ignore the security labels,
>> > so that's what I'm trying to do with these changes. If there's some
>> > better thing to do, or some better way to do it, I'm more than happy to
>> > receive that feedback.
>>
>> Ugh.
>>
>> This made me realize that we have an interesting problem here. An
>> unprivileged mount of tmpfs probably needs to have
>> s_user_ns == &init_user_ns.
>>
>> Otherwise we will break security labels on tmpfs for no good reason.
>> ramfs and sysfs also seem to have similar concerns.
>>
>> Because they have no backing store we can trust those filesystems with
>> security labels.
>> Plus for at least sysfs there is the security label
>> bleed through issue, that we need to make certain works.
>>
>> Perhaps these filesystems with trusted backing store need to call
>> "sget_userns(..., &init_user_ns)".
>>
>> If we don't get this right we will have significant regressions with
>> respect to security labels, and that is not ok.
>
> That's only a problem if there's anyone who sets security labels on
> such a mount. You need global caps to do that (I hope), which
> requires someone outside the userns to help, which means there's a
> good chance that literally no one does this.

Fair enough. That is, however, something we need to test.

If no one puts security labels or file caps on such a mount, we can change
things. If anyone does, we can't, because it would introduce regressions.

Eric
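
For illustration only, the trust rule discussed in this thread can be sketched as a small userspace model in C. This is not kernel code and not taken from the patch series: the structs only borrow the kernel's names, and sget_userns_model() and sb_labels_trusted() are hypothetical helpers made up for this sketch. The assumption it encodes is the one stated above: labels and file caps are honoured only when s_user_ns == &init_user_ns, so a filesystem with no untrusted backing store (tmpfs, ramfs, sysfs) would keep its labels by being created with &init_user_ns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model: these types only borrow kernel names. */
struct user_namespace {
	const char *name;
};

static struct user_namespace init_user_ns = { "init_user_ns" };

struct super_block {
	const char *fs_name;
	struct user_namespace *s_user_ns;	/* namespace that owns the mount */
};

/*
 * Hypothetical stand-in for the sget_userns(..., ns) call suggested in
 * the thread: it only records which namespace owns the superblock.
 */
static struct super_block sget_userns_model(const char *fs_name,
					    struct user_namespace *ns)
{
	assert(fs_name != NULL && ns != NULL);
	struct super_block sb = { fs_name, ns };
	return sb;
}

/*
 * Hypothetical helper: security labels and file caps are trusted only
 * when the superblock belongs to init_user_ns.
 */
static bool sb_labels_trusted(const struct super_block *sb)
{
	return sb->s_user_ns == &init_user_ns;
}
```

In this model an ext4 superblock created with a child namespace reports its labels as untrusted, while a tmpfs superblock created with &init_user_ns stays trusted even if the mount was requested from inside a user namespace. Whether the real code should behave this way is exactly the open question in the thread.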