Re: [PATCH 00/10] Extend and tweak mapping support

Amir Goldstein <amir73il@xxxxxxxxx> · Tue, 30 Nov 2021 07:51:41 +0200

On Tue, Nov 23, 2021 at 2:16 PM Christian Brauner <brauner@xxxxxxxxxx> wrote:
>
> From: Christian Brauner <christian.brauner@xxxxxxxxxx>
>
> Hey,
>
> This series extends the mapping infrastructure in order to support mapped
> mounts of mapped filesystems in the future.
>
> Currently we only support mapped mounts of filesystems mounted without an
> idmapping. This was a conscious decision mentioned in multiple places. For
> example, see [1].
>
> In our mapping documentation in [3] we explained in detail that it is
> perfectly fine to extend support for mapped mounts to filesystems mounted
> with an idmapping should the need arise. The need has been there for some
> time now (cf. [2]).
>
> Before we can port any such filesystem we need to first extend the mapping
> helpers to account for the filesystem's idmapping in the remapping helpers.
> This, again, is explained at length in our documentation at [3].
>
> Currently, the low-level mapping helpers implement the remapping algorithms
> described in [3] in a simplified manner. Because we could rely on the fact
> that all filesystems supporting mapped mounts are mounted without an
> idmapping the translation step from or into the filesystem idmapping could
> be skipped.
>
> In order to support mapped mounts of filesystems mountable with an
> idmapping the translation step we were able to skip before cannot be
> skipped anymore. A filesystem mounted with an idmapping is very likely to
> not use an identity mapping and will instead use a non-identity mapping. So
> the translation step from or into the filesystem's idmapping in the
> remapping algorithm cannot be skipped for such filesystems. More details
> with examples can be found in [3].
>
> This series adds a few new low-level mapping helpers, and prepares and
> tweaks some existing ones, to perform the full translation algorithm
> explained in [3]. The low-level helpers can be written in a way that they
> only perform the additional translation step when the filesystem is indeed
> mounted with an idmapping.
>
> Since we don't support such a filesystem yet, a kernel was compiled
> carrying a trivial patch making ext4 mountable with an idmapping:
>
> # We're located on the host with the initial idmapping.
> ubuntu@f2-vm:~$ cat /proc/self/uid_map
>          0          0 4294967295
>
> # Mount an ext4 filesystem with the initial idmapping.
> ubuntu@f2-vm:~$ sudo mount -t ext4 /dev/loop0 /mnt
>
> # The filesystem contains two files. One owned by id 0 and another one owned by
> # id 1000 in the initial idmapping.
> ubuntu@f2-vm:~$ ls -al /mnt/
> total 8
> drwxrwxrwx  2 root   root   4096 Nov 22 17:04 .
> drwxr-xr-x 24 root   root   4096 Nov 20 11:24 ..
> -rw-r--r--  1 root   root      0 Nov 22 17:04 file_init_mapping_0
> -rw-r--r--  1 ubuntu ubuntu    0 Nov 22 17:04 file_init_mapping_1000
>
> # Unmount it again so we can mount it in another namespace later.
> ubuntu@f2-vm:~$ sudo umount  /mnt
>
> # Use the lxc-usernsexec binary to run a shell in a user and mount namespace
> # with an idmapping of 0:10000:100000000.
> #
> # This idmapping will have the effect that files which are owned by i_{g,u}id
> # 10000 and files that are owned by i_{g,u}id 11000 will be owned by {g,u}id
> # 0 and {g,u}id 1000 within that namespace, respectively.
> ubuntu@f2-vm:~$ sudo lxc-usernsexec -m b:0:10000:100000000 -- bash
>
> # Verify that we're really running with the expected idmapping.
> root@f2-vm:/home/ubuntu# cat /proc/self/uid_map
>          0      10000  100000000
>
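
(Side note for anyone following along: the arithmetic behind the
0:10000:100000000 idmapping above can be sketched as below. This is only a
simplified Python model of a single uid_map line; the names IdMapping,
to_ns and from_ns are made up for illustration and are not the kernel's
helpers.)

```python
from typing import NamedTuple, Optional


class IdMapping(NamedTuple):
    """One uid_map/gid_map line: <first id in ns> <first id outside> <count>."""
    ns_first: int
    outer_first: int
    count: int


def to_ns(m: IdMapping, outer_id: int) -> Optional[int]:
    """Translate an id from outside the namespace into the namespace."""
    offset = outer_id - m.outer_first
    return m.ns_first + offset if 0 <= offset < m.count else None


def from_ns(m: IdMapping, ns_id: int) -> Optional[int]:
    """Translate an id seen inside the namespace back to the outside id."""
    offset = ns_id - m.ns_first
    return m.outer_first + offset if 0 <= offset < m.count else None


# The mapping from the transcript: 0:10000:100000000.
m = IdMapping(ns_first=0, outer_first=10000, count=100000000)

for i_uid in (0, 10000, 11000):
    # None means the id has no mapping in this namespace.
    print(i_uid, "->", to_ns(m, i_uid))
```

With this model, ids 10000 and 11000 come out as 0 and 1000 inside the
namespace, matching the comment above, while id 0 has no mapping at all.
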
> # Mount the ext4 filesystem in the user and mountns with the idmapping
> # 0:10000:100000000.
> #
> # Note that this requires a test kernel that makes ext4 mountable in a
> # non-initial userns. The patch is simply:
> #
> # diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> # index 4e33b5eca694..0221e8211e5b 100644
> # --- a/fs/ext4/super.c
> # +++ b/fs/ext4/super.c
> # @@ -6584,7 +6584,7 @@ static struct file_system_type ext4_fs_type = {
> #         .name           = "ext4",
> #         .mount          = ext4_mount,
> #         .kill_sb        = kill_block_super,
> # -       .fs_flags       = FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
> # +       .fs_flags       = FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_USERNS_MOUNT,
> #  };
> #  MODULE_ALIAS_FS("ext4");
> root@f2-vm:/home/ubuntu# mount -t ext4 /dev/loop0 /mnt
>

Hi Christian,

I have a question not directly related to the patches, but to the test
hack above.
I may be wrong, but it looks like an idmapped sb would be desired for some
use cases(?).

My question is: could we use fsconfig() to allow CAP_SYS_ADMIN to attach
a newly mounted sb (e.g. ext4) to a user ns, without allowing ext4 to be
mounted from within the userns? Wouldn't that improve usability for some
users without adding any new risks beyond those that already exist with
idmapped mounts?

Thanks,
Amir.