On Tue, Nov 01, 2022 at 04:14:39PM -0700, Jeff Xu wrote:
> Sorry for the long overdue reply.

No worries! I am a fan of thread necromancy. :)

> [...]
> 1> memfd_create:
> Add two flags:
> #define MFD_EXEC         0x0008
> #define MFD_NOEXEC_SEAL  0x0010
> This lets application to set executable bit explicitly.
> (If application set both, it will be rejected)

So no MFD_NOEXEC without seal? (I'm fine with that.)

> 2> For old application that doesn't set executable bit:
> Add a pid name-spaced sysctl.kernel.pid_mfd_noexec, with:

bikeshed: vm.memfd_noexec (doesn't belong in "kernel", and seems better
suited to "vm" than "fs")

> value = 0: Default_EXEC
>      Honor MFD_EXEC and MFD_NOEXEC_SEAL
>      When none is set, will fall back to original behavior (EXEC)

Yeah. Rephrasing for myself to understand more clearly:

"memfd_create() with neither MFD_EXEC nor MFD_NOEXEC_SEAL acts as if
MFD_EXEC was set."

> value = 1: Default_NOEXEC_SEAL
>      Honor MFD_EXEC and MFD_NOEXEC_SEAL
>      When none is set, will default to MFD_NOEXEC_SEAL

"memfd_create() with neither MFD_EXEC nor MFD_NOEXEC_SEAL acts as if
MFD_NOEXEC_SEAL was set."

Also, I think there should be a pr_warn_ratelimited() when memfd_create()
is used without either bit, so that there is some pressure on callers to
adjust their API calls to explicitly set a bit.

> 3> Add a pid name-spaced sysctl kernel.pid_mfd_noexec_enforced: with:
> value = 0: default, not enforced.
> value = 1: enforce NOEXEC_SEAL (overwrite everything)

How about making this just mode "value 2" for the first sysctl?

"memfd_create() without MFD_NOEXEC_SEAL will be rejected."

-Kees

-- 
Kees Cook
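
P.S. To make the three modes concrete, here is a rough userspace sketch of the flag resolution as I understand it from the thread. It is not a patch and not the kernel implementation: the enum names, the helper name resolve_memfd_exec_flags(), and the specific errno values (-EINVAL for both bits set, -EACCES for the enforced mode) are all assumptions made for illustration. Only the two flag values and the three mode numbers come from the discussion above.

```c
#include <errno.h>

/* Flag values as proposed in the quoted mail. */
#define MFD_EXEC         0x0008U
#define MFD_NOEXEC_SEAL  0x0010U

/* Hypothetical names for the sysctl values 0, 1 and the suggested 2. */
enum memfd_noexec_mode {
	MEMFD_NOEXEC_DEFAULT_EXEC = 0,        /* neither bit set: act like MFD_EXEC */
	MEMFD_NOEXEC_DEFAULT_NOEXEC_SEAL = 1, /* neither bit set: act like MFD_NOEXEC_SEAL */
	MEMFD_NOEXEC_ENFORCED = 2,            /* reject anything without MFD_NOEXEC_SEAL */
};

/*
 * Hypothetical helper: work out the effective flags for a memfd_create()
 * call under a given mode. Returns 0 and stores the result in *out, or a
 * negative errno when the call would be rejected. The errno choices are
 * assumptions; the thread only says "rejected".
 */
static int resolve_memfd_exec_flags(unsigned int flags,
				    enum memfd_noexec_mode mode,
				    unsigned int *out)
{
	unsigned int exec_bits = flags & (MFD_EXEC | MFD_NOEXEC_SEAL);

	/* Setting both bits is contradictory and always rejected. */
	if (exec_bits == (MFD_EXEC | MFD_NOEXEC_SEAL))
		return -EINVAL;

	/* Mode 2: only callers that explicitly ask for NOEXEC_SEAL pass. */
	if (mode == MEMFD_NOEXEC_ENFORCED && !(flags & MFD_NOEXEC_SEAL))
		return -EACCES;

	/* Legacy caller with neither bit: pick the default for this mode. */
	if (!exec_bits) {
		/* This is where a pr_warn_ratelimited() would go in the kernel. */
		if (mode == MEMFD_NOEXEC_DEFAULT_NOEXEC_SEAL)
			flags |= MFD_NOEXEC_SEAL;
		else
			flags |= MFD_EXEC;
	}

	*out = flags;
	return 0;
}
```

Folding the "enforced" knob into the same sysctl as value 2 keeps the whole decision in one place, which is the main reason I prefer it over a second sysctl.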