Re: [PATCH v3 00/25] user_namespace: introduce fsid mappings

On 2/18/20 9:33 AM, Christian Brauner wrote:
Hey everyone,

This is v3. After (off- and online) discussions with Jann, the following
changes were made:
- To handle nested user namespaces cleanly, efficiently, and with full
   backwards compatibility for non fsid-mapping aware workloads we only
   allow writing fsid mappings as long as the corresponding id mapping
   type has not been written.
- Split the patch which adds the internal ability in
   kernel/user_namespace to verify and write fsid mappings into three
   patches:
   1. [PATCH v3 04/25] fsuidgid: add fsid mapping helpers
      patch to implement core helpers for fsid translations (i.e.
      make_kfs*id(), from_kfs*id{_munged}(), kfs*id_to_k*id(),
      k*id_to_kfs*id())
   2. [PATCH v3 05/25] user_namespace: refactor map_write()
      patch to refactor map_write() in order to prepare for actual fsid
      mappings changes in the following patch. (This should make it
      easier to review.)
   3. [PATCH v3 06/25] user_namespace: make map_write() support fsid mappings
      patch to implement actual fsid mappings support in map_write()
- Let the keyctl infrastructure only operate on kfsids, which are always
   mapped/looked up in the id mappings, similar to what we do for
   filesystems that have the same superblock visible in multiple user
   namespaces.
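
As a rough illustration of the kind of translation the helpers listed above
perform, here is a minimal sketch. It assumes fsid mappings use the same
"first / lower_first / count" extents as the existing uid_map and gid_map
files. The names Extent, map_down and map_up and the example ranges are
hypothetical; this is not the kernel code or the patchset's interface.

```python
from typing import NamedTuple, Optional, Sequence


class Extent(NamedTuple):
    first: int        # first id as seen inside the user namespace
    lower_first: int  # first id of the range on the kernel (parent) side
    count: int        # number of ids covered by this extent


def map_down(extents: Sequence[Extent], ns_id: int) -> Optional[int]:
    """Translate a namespace-visible id to the kernel side, None if unmapped."""
    for e in extents:
        if e.first <= ns_id < e.first + e.count:
            return e.lower_first + (ns_id - e.first)
    return None


def map_up(extents: Sequence[Extent], k_id: int) -> Optional[int]:
    """Translate a kernel-side id back into the namespace, None if unmapped."""
    for e in extents:
        if e.lower_first <= k_id < e.lower_first + e.count:
            return e.first + (k_id - e.lower_first)
    return None


# Hypothetical container: the id mapping and the fsid mapping cover the same
# in-namespace range but point at different kernel-side ranges.
id_map = [Extent(first=0, lower_first=100000, count=65536)]
fsid_map = [Extent(first=0, lower_first=300000, count=65536)]

print(map_down(id_map, 1000))     # id used for non-filesystem purposes
print(map_down(fsid_map, 1000))   # id used for filesystem ownership
print(map_up(fsid_map, 301000))   # kernel-side owner seen from the namespace
```

With two separate tables, two containers could in principle share one
fsid mapping (and so agree on file ownership) while keeping distinct id
mappings for everything else.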

This version also comes with minimal tests which I intend to expand in
the future.

From pings and off-list questions and discussions at the Google Container
Security Summit there seems to be quite a lot of interest in this
patchset, with use-cases ranging from layer sharing for app containers
and k8s to data sharing between containers with different id
mappings. I haven't Cced all people because I don't have all the email
addresses at hand but I've at least added Phil now. :)

I put this into a kernel for our container guys to mess with in order to validate it would actually be useful for real world uses. I've cc'ed the guy who did all of the work in case you have specific questions.

Good news is the interface is acceptable, albeit apparently the whole user ns interface sucks in general. But you haven't made it worse, so success!

But in testing it, there appears to be a problem with tmpfs. Our applications use shared memory segments for certain things, and the patchset apparently breaks this in interesting ways: it appears not to shift the UID appropriately on tmpfs. This seems to be relatively straightforward to reproduce, but if you have trouble let me know and I'll come up with a shell script that reproduces the problem.

We are happy to continue testing these patches to make sure they're working in our container setup. If you want to CC me on future submissions, I can build them for our internal testing and validate them as well. Thanks,

Josef


