Re: [PATCH v2] proc: "mount -o lookup=" support

Alexey Gladkov <legion@xxxxxxxxxx> · Wed, 19 Jan 2022 19:30:00 +0100

On Wed, Jan 19, 2022 at 06:31:07PM +0100, Christian Brauner wrote:
> On Wed, Jan 19, 2022 at 06:15:22PM +0100, Alexey Gladkov wrote:
> > On Wed, Jan 19, 2022 at 05:24:23PM +0100, Christian Brauner wrote:
> > > On Wed, Jan 19, 2022 at 06:48:03PM +0300, Alexey Dobriyan wrote:
> > > > From 61376c85daab50afb343ce50b5a97e562bc1c8d3 Mon Sep 17 00:00:00 2001
> > > > From: Alexey Dobriyan <adobriyan@xxxxxxxxx>
> > > > Date: Mon, 22 Nov 2021 20:41:06 +0300
> > > > Subject: [PATCH 1/1] proc: "mount -o lookup=..." support
> > > > 
> > > > Docker implements MaskedPaths configuration option
> > > > 
> > > > 	https://github.com/estesp/docker/blob/9c15e82f19b0ad3c5fe8617a8ec2dddc6639f40a/oci/defaults.go#L97
> > > > 
> > > > to disable certain /proc files. It overmounts them with /dev/null.
> > > > 
> > > > Implement a proper mount option which selectively disables lookup/readdir
> > > > in the top level /proc directory so that MaskedPaths doesn't need
> > > > to be updated as time goes on.
> > > 
> > > I might've missed this when this was sent the last time so maybe it was
> > > clearly explained in an earlier thread: What's the reason this needs to
> > > live in the kernel?
> > > 
> > > The MaskedPaths entry is optional so runtimes aren't required to block
> > > anything by default and this mostly makes sense for workloads that run
> > > privileged.
> > > 
> > > In addition, MaskedPaths is a generic option which allows hiding any
> > > existing path, not just proc. Even in the very docker-specific defaults
> > > /sys/firmware is covered.
> > > 
> > > I do see clear value in the subset= and hidepid= options. They are
> > > generally useful independent of opinionated container workloads. I don't
> > > see the same for lookup=.
> > > 
> > > An alternative I find more sensible is to add a new value for subset=
> > > that hides anything(?) that only global root should have read/write
> > > access to.
> > 
> > Or we can allow permissions in procfs to be changed only in the
> > decreasing direction (if some file has 644, then allow setting 640 or
> > 600). In this case, we will not need to constantly check the whitelist.
> 
> I don't fancy any filtering or allowlist approach. I find that rather
> inelegant.

Yep. I also don't find it very convenient if you need to allow more than
one or two files. That's why I didn't do anything like that when I
implemented subset=.

> But if I understand you correctly, the idea is that if we were to have
> decreasing permissions we could allow a (namespace) procfs-admin to set
> permissions so that the relevant files are essentially read-only or not
> even readable at all for container workloads. So once you've lowered
> perms you can't raise them which ensures even namespace procfs-admin
> can't raise them again.

Yes. This is what I meant.

> Might work as well. But that implies that we wouldn't need any allowlist
> at all afaict.

Yes, in this case we don't need a list.

-- 
Rgrds, legion