Hello!

On Wed, Jun 28, 2023 at 12:31 PM Dominique Martinet
<asmadeus@xxxxxxxxxxxxx> wrote:
>
> Dominique Martinet wrote on Wed, Jun 28, 2023 at 08:42:41PM +0900:
> > If flags already has either MFD_EXEC or MFD_NOEXEC_SEAL, you don't check
> > the sysctl at all.
> > [...repro snipped..]
> >
> > What am I missing?
>
> (Perhaps the intent is just to force people to use the flag so it is
> easier to check for memfd_create in seccomp or other LSM?
> But I don't see why such a check couldn't consider the absence of a flag
> as well, so I don't see the point.)
>
Yes. Part of the intent is to motivate app developers to migrate their
code to pass the new MFD_EXEC/MFD_NOEXEC_SEAL flags to memfd_create, if
that answers your question.

>
> > BTW I find the current behaviour rather hard to use: setting this to 2
> > should still set NOEXEC by default in my opinion, just refuse anything
> > that explicitly requested EXEC.
>
> And I just noticed it's not possible to lower the value despite having
> CAP_SYS_ADMIN: what the heck?! I have never seen such a sysctl and it
> just forced me to reboot because I willy-nilly tested in the init pid
> namespace, and quite a few applications that don't require exec broke
> exactly as I described below.
>
> If the user has CAP_SYS_ADMIN there are more container escape methods
> than I can count, this is basically free pass to root on main namespace
> anyway, you're not protecting anything. Please let people set the sysctl
> to what they want.
>
Yama has a similar setting: for example, once it is set to 3
(YAMA_SCOPE_NO_ATTACH), it cannot be downgraded at runtime.

Since this is a security feature, not allowing a downgrade at runtime is
part of the security design. I hope you understand.

> --
> Dominique Martinet | Asmadeus

Thanks!
-Jeff