On Thu, Feb 1, 2024 at 8:10 PM Theo de Raadt <deraadt@xxxxxxxxxxx> wrote:
>
> Jeff Xu <jeffxu@xxxxxxxxxxxx> wrote:
>
> > On Thu, Feb 1, 2024 at 7:54 PM Theo de Raadt <deraadt@xxxxxxxxxxx> wrote:
> > >
> > > Jeff Xu <jeffxu@xxxxxxxxxxxx> wrote:
> > >
> > > > On Thu, Feb 1, 2024 at 3:11 PM Eric Biggers <ebiggers@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Jan 31, 2024 at 05:50:24PM +0000, jeffxu@xxxxxxxxxxxx wrote:
> > > > > > [PATCH v8 2/4] mseal: add mseal syscall
> > > > > [...]
> > > > > > +/*
> > > > > > + * The PROT_SEAL defines memory sealing in the prot argument of mmap().
> > > > > > + */
> > > > > > +#define PROT_SEAL	0x04000000	/* _BITUL(26) */
> > > > > > +
> > > > > >  /* 0x01 - 0x03 are defined in linux/mman.h */
> > > > > >  #define MAP_TYPE	0x0f		/* Mask for type of mapping */
> > > > > >  #define MAP_FIXED	0x10		/* Interpret addr exactly */
> > > > > > @@ -33,6 +38,9 @@
> > > > > >  #define MAP_UNINITIALIZED 0x4000000	/* For anonymous mmap, memory could be
> > > > > >  					 * uninitialized */
> > > > > >
> > > > > > +/* map is sealable */
> > > > > > +#define MAP_SEALABLE	0x8000000	/* _BITUL(27) */
> > > > >
> > > > > IMO this patch is misleading, as it claims to just be adding a new syscall, but
> > > > > it actually adds three new UAPIs, only one of which is the new syscall. The
> > > > > other two new UAPIs are new flags to the mmap syscall.
> > > > >
> > > > The description does include all three. I could update the patch title.
> > > >
> > > > > Based on recent discussions, it seems the usefulness of the new mmap flags has
> > > > > not yet been established. Note also that there are only a limited number of
> > > > > mmap flags remaining, so we should be careful about allocating them.
> > > > >
> > > > > Therefore, why not start by just adding the mseal syscall, without the new mmap
> > > > > flags alongside it?
> > > > >
> > > > > I'll also note that the existing PROT_* flags seem to be conventionally used for
> > > > > the CPU page protections, as opposed to kernel-specific properties of the VMA
> > > > > object. As such, PROT_SEAL feels a bit out of place anyway. If it's added at
> > > > > all it perhaps should be a MAP_* flag, not PROT_*. I'm not sure this aspect has
> > > > > been properly discussed yet, seeing as the patchset is presented as just adding
> > > > > sys_mseal(). Some reviewers may not have noticed or considered the new flags.
> > > > >
> > > > MAP_ flags is more used for type of mapping, such as MAP_FIXED_NOREPLACE.
> > > >
> > > > The PROT_SEAL might make more sense because sealing the protection bit
> > > > is the main functionality of the sealing at this moment.
> > >
> > > Jeff, please show a piece of software that needs to do PROT_SEAL as
> > > mprotect() or mmap() argument.
> > >
> > I didn't propose mprotect().
> >
> > for mmap() here is a potential use case:
> >
> > fs/binfmt_elf.c
> > if (current->personality & MMAP_PAGE_ZERO) {
> >         /* Why this, you ask???  Well SVr4 maps page 0 as read-only,
> >            and some applications "depend" upon this behavior.
> >            Since we do not have the power to recompile these, we
> >            emulate the SVr4 behavior. Sigh. */
> >
> >         error = vm_mmap(NULL, 0, PAGE_SIZE,
> >                         PROT_READ | PROT_EXEC,    <-- add PROT_SEAL
> >                         MAP_FIXED | MAP_PRIVATE, 0);
> > }
> >
> > I don't see the benefit of RWX page 0, which might make a null
> > pointers error to become executable for some code.
> >
>
> And this is a lot faster than doing the operation as a second step?
>
>
> But anyways, that's kernel code.  It is not userland exposed API used
> by programs.
>
> The question is the damage you create by adding API exposed to
> userland (since this is Linux: forever).
>
> I should be the first person thrilled to see Linux make API/ABI mistakes
> they have to support forever, but I can't be that person.
>
Point taken. I can remove PROT_SEAL.
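
For reference, the "second step" alternative raised above (map first, then seal with a separate call) might look roughly like the sketch below from userland. It is only an illustration, not code from the patchset: it assumes the mseal(addr, len, flags) form of the syscall with flags set to 0, assumes syscall number 462 when the installed headers do not define __NR_mseal, and map_then_seal() is a hypothetical helper name. On a kernel without mseal the second step simply reports ENOSYS.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: 462 is the mseal syscall number; headers that predate the
 * syscall do not define __NR_mseal. */
#ifndef __NR_mseal
#define __NR_mseal 462
#endif

/*
 * Hypothetical helper: map an anonymous read-only region, then seal it in
 * a second step, without any PROT_SEAL or MAP_SEALABLE flag.
 *
 * Returns the mapping, or NULL if mmap() itself failed. *seal_err is set
 * to 0 if the seal was applied, otherwise to the errno reported by the
 * mseal call (for example ENOSYS on kernels that lack the syscall).
 */
static void *map_then_seal(size_t len, int *seal_err)
{
	/* Step one: an ordinary mmap() with the existing flags only. */
	void *p = mmap(NULL, len, PROT_READ,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* Step two: seal the range; flags is assumed to be 0. */
	if (syscall(__NR_mseal, p, len, 0UL) == 0)
		*seal_err = 0;
	else
		*seal_err = errno;

	return p;
}
```

A caller would pass a page-aligned length, e.g. map_then_seal(sysconf(_SC_PAGESIZE), &err), and treat a non-zero err as "mapped but not sealed". The cost relative to a flag on mmap() is one extra syscall per mapping.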