Re: [PATCH v9 0/7] mseal system mappings

On Tue, Mar 4, 2025 at 9:51 PM Lorenzo Stoakes
<lorenzo.stoakes@xxxxxxxxxx> wrote:
>
> On Wed, Mar 05, 2025 at 02:17:04AM +0000, jeffxu@xxxxxxxxxxxx wrote:
> > From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
> >
> > This is version 9, addressing comments on V8 without changing the code
> > logic.
> >
> > -------------------------------------------------------------------
> > As discussed during mseal() upstream process [1], mseal() protects
> > the VMAs of a given virtual memory range against modifications, such
> > as the read/write (RW) and no-execute (NX) bits. For complete
> > descriptions of memory sealing, please see mseal.rst [2].
> >
> > mseal() is useful for mitigating memory corruption issues where a
> > corrupted pointer is passed to a memory management system. For
> > example, such an attacker primitive can break control-flow integrity
> > guarantees since read-only memory that is supposed to be trusted can
> > become writable or .text pages can get remapped.
> >
> > The system mappings are read-only; memory sealing can protect them
> > from ever becoming writable or from being unmapped or remapped with
> > different attributes.
> >
> > System mappings such as vdso, vvar, vvar_vclock,
> > vectors (arm compat-mode), sigpage (arm compat-mode),
> > are created by the kernel during program initialization, and could
> > be sealed after creation.
> >
> > Unlike the aforementioned mappings, the uprobe mapping is not
> > established during program startup. However, its lifetime is the same
> > as the process's lifetime [3]. It could be sealed from creation.
> >
> > The vsyscall on x86-64 uses a special address (0xffffffffff600000),
> > which is outside the mm managed range. This means mprotect, munmap, and
> > mremap won't work on the vsyscall. Since sealing doesn't enhance
> > the vsyscall's security, it is skipped in this patch. If we ever seal
> > the vsyscall, it will probably be for decorative purposes only, i.e.
> > showing the 'sl' flag in /proc/pid/smaps.
> >
> > It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
> > alter the system mappings during restore operations. UML (User Mode
> > Linux), gVisor, and rr are also known to change the vdso/vvar mappings.
> > Consequently, this feature cannot be universally enabled across all
> > systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.
> >
> > To support mseal of system mappings, architectures must define
> > CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special
> > mappings calls to pass the mseal flag. Additionally, architectures must
> > confirm they do not unmap/remap system mappings during the process
> > lifetime. The existence of this flag for an architecture implies that
> > it does not require the remapping of these system mappings during
> > process lifetime, so sealing these mappings is safe from a kernel
> > perspective.
> >
> > This version covers the x86-64 and arm64 architectures as a minimum
> > viable feature.
> >
> > While no specific CPU hardware features are required to enable this
> > feature on an architecture, memory sealing requires a 64-bit kernel. Other
> > architectures can choose whether or not to adopt this feature. Currently,
> > I'm not aware of any instances in the kernel code that actively
> > munmap/mremap a system mapping without a request from userspace. The PPC
> > does call munmap when _install_special_mapping fails for vdso; however,
> > it's uncertain if this will ever fail for PPC - this needs to be
> > investigated by PPC in the future [4]. The UML kernel can add this support
> > when KUnit tests require it [5].
> >
> > In this version, we've improved the handling of system mapping sealing
> > compared with previous versions. Instead of modifying the
> > _install_special_mapping function itself, which would affect all
> > architectures, we now call
> > _install_special_mapping with a sealing flag only within the specific
> > architecture that requires it. This targeted approach offers two key
> > advantages: 1) It limits the code change's impact to the necessary
> > architectures, and 2) It aligns with the software architecture by keeping
> > the core memory management within the mm layer, while delegating the
> > decision of sealing system mappings to the individual architecture, which
> > is particularly relevant since 32-bit architectures never require sealing.
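
To illustrate the per-architecture opt-in described above, here is a
minimal userspace sketch, not the patch itself. VM_SEALED_SYSMAP and
CONFIG_MSEAL_SYSTEM_MAPPINGS are the names used in this series; the bit
values and the arch_vdso_vm_flags() helper are placeholders assumed for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define VM_NONE    UINT64_C(0)
#define VM_READ    (UINT64_C(1) << 0)
#define VM_EXEC    (UINT64_C(1) << 2)
#define VM_MAYREAD (UINT64_C(1) << 4)
#define VM_MAYEXEC (UINT64_C(1) << 6)
#define VM_SEALED  (UINT64_C(1) << 63)

/* With the config off, the macro collapses to "no extra flag". */
#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS
#define VM_SEALED_SYSMAP VM_SEALED
#else
#define VM_SEALED_SYSMAP VM_NONE
#endif

static_assert(VM_SEALED_SYSMAP == VM_NONE || VM_SEALED_SYSMAP == VM_SEALED,
	      "VM_SEALED_SYSMAP must be VM_NONE or VM_SEALED");

/*
 * Hypothetical helper: the flags an opted-in architecture would pass
 * when installing its vdso mapping. The core mm code stays unchanged;
 * only the caller decides whether the sealing bit is included.
 */
static inline uint64_t arch_vdso_vm_flags(void)
{
	return VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYEXEC | VM_SEALED_SYSMAP;
}
```

Building with -DCONFIG_MSEAL_SYSTEM_MAPPINGS adds the sealing bit to the
returned flags; without it the flags are the same as before.
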
> >
> > Prior to this patch series, we explored sealing special mappings from
> > userspace using glibc's dynamic linker. This approach revealed several
> > issues:
> > - The PT_LOAD header may report an incorrect length for vdso (smaller
> >   than its actual size). The dynamic linker, which relies on PT_LOAD
> >   information to determine mapping size, would then split and partially
> >   seal the vdso mapping. Since each architecture has its own vdso/vvar
> >   code, fixing this in the kernel would require going through each
> >   architecture. Our initial goal was to enable sealing read-only
> >   mappings, e.g. .text, across all architectures; sealing the vdso in
> >   the kernel at creation appears simpler than sealing it in glibc.
> > - The [vvar] mapping header only contains address information, not length
> >   information. Similar issues might exist for other special mappings.
> > - Mappings like uprobe are not covered by the dynamic linker,
> >   and there is no effective solution for them.
> >
> > This feature's security enhancements will benefit ChromeOS, Android,
> > and other high security systems.
> >
> > Testing:
> > This feature was tested on ChromeOS and Android for both x86-64 and ARM64.
> > - Enable sealing and verify vdso/vvar, sigpage, vectors are sealed properly,
> >   i.e. "sl" shown in the smaps for those mappings, and mremap is blocked.
> > - Pass various automation tests (e.g. pre-checkin) on ChromeOS and
> >   Android to ensure the sealing doesn't affect the functionality of
> >   Chromebooks and Android phones.
> >
> > I also tested the feature on Ubuntu on x86-64:
> > - With the config disabled, vdso/vvar is not sealed.
> > - With the config enabled, vdso/vvar is sealed, Ubuntu boots fine, and
> >   normal operations such as browsing the web and opening/editing
> >   documents work.
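
The "sl" check in the Testing notes above can be sketched in userspace
roughly as follows. This is only a sketch, not the selftest from this
series: vmflags_has() and mapping_is_sealed() are hypothetical helper
names, and the code assumes the usual smaps layout (a header line
starting with "start-end", followed by field lines including
"VmFlags:").

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if a "VmFlags:" line contains the exact token `flag`. */
int vmflags_has(const char *line, const char *flag)
{
	static const char prefix[] = "VmFlags:";
	size_t flen = strlen(flag);
	const char *p;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return 0;
	p = line + sizeof(prefix) - 1;
	while (*p) {
		size_t n;

		p += strspn(p, " \t\n");	/* skip separators */
		n = strcspn(p, " \t\n");	/* length of this token */
		if (n > 0 && n == flen && strncmp(p, flag, n) == 0)
			return 1;
		p += n;
	}
	return 0;
}

/*
 * Scan an smaps stream for the first mapping whose header line contains
 * `name` (e.g. "[vdso]"). Returns 1 if its VmFlags include "sl",
 * 0 if they do not, and -1 if no such mapping (or no VmFlags) is found.
 */
int mapping_is_sealed(FILE *smaps, const char *name)
{
	char line[512];
	unsigned long start, end;
	int in_target = 0;

	while (fgets(line, sizeof(line), smaps)) {
		/* Only header lines parse as "<hex>-<hex>". */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2) {
			in_target = strstr(line, name) != NULL;
			continue;
		}
		if (in_target && strncmp(line, "VmFlags:", 8) == 0)
			return vmflags_has(line, "sl");
	}
	return -1;
}
```

For example, opening /proc/self/smaps with fopen() and passing the
stream with "[vdso]" reports whether the calling process's vdso carries
the flag.
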
> >
> > Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@xxxxxxxxxxxx/ [1]
> > Link: Documentation/userspace-api/mseal.rst [2]
> > Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@xxxxxxxxxxxxxx/ [3]
> > Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@xxxxxxxxxxxxxx/ [4]
> > Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5]
> >
> > -------------------------------------------
> > History:
> >
> > V9:
> >   - Add negative test in selftest (Kees Cook)
> >   - Fix typos in text (Kees Cook)
>
> You have a bad habit of missing stuff off these logs. Usually I don't
> comment, as it's trivial, but while we're here :)
>
> Please try to keep an accurate log of changes requested so you can populate
> these properly.
>
> Obviously this is not going to block anything. But for future reference...
>
>   - Add selftest to main selftest Makefile (Lorenzo Stoakes)
>
> >
> > V8:
>
> Nit, but no lore link?
https://lore.kernel.org/all/20250303050921.3033083-1-jeffxu@xxxxxxxxxx/

Thanks for noticing this.

>
> >   - Change ARCH_SUPPORTS_MSEAL_X to ARCH_SUPPORTS_MSEAL_X (Liam R. Howlett)
> >   - Update comments in Kconfig and mseal.rst (Lorenzo Stoakes, Liam R. Howlett)
> >   - Change patch header prefix to "mseal sysmap" (Lorenzo Stoakes)
> >   - Remove "vm_flags =" (Kees Cook, Liam R. Howlett, Oleg Nesterov)
> >   - Drop uml architecture (Lorenzo Stoakes, Kees Cook)
> >   - Add a selftest to verify system mappings are sealed (Lorenzo Stoakes)
> >
> > V7:
> >   https://lore.kernel.org/all/20250224225246.3712295-1-jeffxu@xxxxxxxxxx/
> >   - Remove cover letter from the first patch (Liam R. Howlett)
> >   - Change macro name to VM_SEALED_SYSMAP (Liam R. Howlett)
> >   - logging and fclose() in selftest (Liam R. Howlett)
> >
> > V6:
> >   https://lore.kernel.org/all/20250224174513.3600914-1-jeffxu@xxxxxxxxxx/
> >   - mseal.rst: fix a typo (Randy Dunlap)
> >   - security/Kconfig: add rr into note (Liam R. Howlett)
> >   - remove mseal_system_mappings() and use macro instead (Liam R. Howlett)
> >   - mseal.rst: add incompatible userland software (Lorenzo Stoakes)
> >   - remove RFC from title (Kees Cook)
> >
> > V5:
> >   https://lore.kernel.org/all/20250212032155.1276806-1-jeffxu@xxxxxxxxxx/
> >   - Remove kernel cmd line (Lorenzo Stoakes)
> >   - Add test info (Lorenzo Stoakes)
> >   - Add threat model info (Lorenzo Stoakes)
> >   - Fix x86 selftest: test_mremap_vdso
> >   - Restrict code change to ARM64/x86-64/UM arch only.
> >   - Add userprocess.h to include seal_system_mapping().
> >   - Remove sealing vsyscall.
> >   - Split the patch.
> >
> > V4:
> >   https://lore.kernel.org/all/20241125202021.3684919-1-jeffxu@xxxxxxxxxx/
> >   - ARCH_HAS_SEAL_SYSTEM_MAPPINGS (Lorenzo Stoakes)
> >   - test info (Lorenzo Stoakes)
> >   - Update mseal.rst (Liam R. Howlett)
> >   - Update test_mremap_vdso.c (Liam R. Howlett)
> >   - Misc. style, comments, doc update (Liam R. Howlett)
> >
> > V3:
> >   https://lore.kernel.org/all/20241113191602.3541870-1-jeffxu@xxxxxxxxxx/
> >   - Revert uprobe to v1 logic (Oleg Nesterov)
> >   - use CONFIG_SEAL_SYSTEM_MAPPINGS instead of _ALWAYS/_NEVER (Kees Cook)
> >   - Move kernel cmd line from fs/exec.c to mm/mseal.c and
> >     misc. (Liam R. Howlett)
> >
> > V2:
> >   https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@xxxxxxxxxx/
> >   - Seal uprobe always (Oleg Nesterov)
> >   - Update comments and description (Randy Dunlap, Liam R. Howlett, Oleg Nesterov)
> >   - Rebase to linux_main
> >
> > V1:
> >   https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@xxxxxxxxxx/
> >
> > --------------------------------------------------
> >
> >
> >
> > Jeff Xu (7):
> >   mseal sysmap: kernel config and header change
> >   selftests: x86: test_mremap_vdso: skip if vdso is msealed
> >   mseal sysmap: enable x86-64
> >   mseal sysmap: enable arm64
> >   mseal sysmap: uprobe mapping
> >   mseal sysmap: update mseal.rst
> >   selftest: test system mappings are sealed.
> >
> >  Documentation/userspace-api/mseal.rst         |  20 +++
> >  arch/arm64/Kconfig                            |   1 +
> >  arch/arm64/kernel/vdso.c                      |  12 +-
> >  arch/x86/Kconfig                              |   1 +
> >  arch/x86/entry/vdso/vma.c                     |   7 +-
> >  include/linux/mm.h                            |  10 ++
> >  init/Kconfig                                  |  22 ++++
> >  kernel/events/uprobes.c                       |   3 +-
> >  security/Kconfig                              |  21 ++++
> >  tools/testing/selftests/Makefile              |   1 +
> >  .../mseal_system_mappings/.gitignore          |   2 +
> >  .../selftests/mseal_system_mappings/Makefile  |   6 +
> >  .../selftests/mseal_system_mappings/config    |   1 +
> >  .../mseal_system_mappings/sysmap_is_sealed.c  | 119 ++++++++++++++++++
> >  .../testing/selftests/x86/test_mremap_vdso.c  |  43 +++++++
> >  15 files changed, 261 insertions(+), 8 deletions(-)
> >  create mode 100644 tools/testing/selftests/mseal_system_mappings/.gitignore
> >  create mode 100644 tools/testing/selftests/mseal_system_mappings/Makefile
> >  create mode 100644 tools/testing/selftests/mseal_system_mappings/config
> >  create mode 100644 tools/testing/selftests/mseal_system_mappings/sysmap_is_sealed.c
> >
> > --
> > 2.48.1.711.g2feabab25a-goog
> >




