Re: [PATCH v10 2/5] mseal: add mseal syscall

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* jeffxu@xxxxxxxxxxxx <jeffxu@xxxxxxxxxxxx> [240415 12:35]:
> From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
> 
> The new mseal() is an syscall on 64 bit CPU, and with
> following signature:
> 
> int mseal(void addr, size_t len, unsigned long flags)
> addr/len: memory range.
> flags: reserved.
> 
> mseal() blocks following operations for the given memory range.
> 
> 1> Unmapping, moving to another location, and shrinking the size,
>    via munmap() and mremap(), can leave an empty space, therefore can
>    be replaced with a VMA with a new set of attributes.
> 
> 2> Moving or expanding a different VMA into the current location,
>    via mremap().
> 
> 3> Modifying a VMA via mmap(MAP_FIXED).
> 
> 4> Size expansion, via mremap(), does not appear to pose any specific
>    risks to sealed VMAs. It is included anyway because the use case is
>    unclear. In any case, users can rely on merging to expand a sealed VMA.
> 
> 5> mprotect() and pkey_mprotect().
> 
> 6> Some destructive madvice() behaviors (e.g. MADV_DONTNEED) for anonymous
>    memory, when users don't have write permission to the memory. Those
>    behaviors can alter region contents by discarding pages, effectively a
>    memset(0) for anonymous memory.
> 
> Following input during RFC are incooperated into this patch:
> 
> Jann Horn: raising awareness and providing valuable insights on the
> destructive madvise operations.
> Linus Torvalds: assisting in defining system call signature and scope.
> Liam R. Howlett: perf optimization.
> Theo de Raadt: sharing the experiences and insight gained from
>   implementing mimmutable() in OpenBSD.
> 
> Finally, the idea that inspired this patch comes from Stephen Röttger’s
> work in Chrome V8 CFI.

No per-vma change is checked prior to entering a per-vma modification
loop today. This means that mseal() differs in behaviour in "up-front
failure" vs "partial change failure" that exists in every other
function.

I'm not saying it's wrong or that it's right - I'm just wondering what
the direction is here.  Either we should do as much up-front as
possible or keep with tradition and have (partial) success where
possible.

If you look at do_mprotect_pkey(), you can even see
map_deny_write_exec() being checked in a loop during modifications.

I think we can all agree that having some up-front and some later
without any reason will lead to a higher probability of things getting
missed.

Thanks,
Liam





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux