On Wed, Feb 12, 2025 at 7:05 AM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote: > ... > > > > In this version, we've improved the handling of system mapping sealing from > > previous versions, instead of modifying the _install_special_mapping > > function itself, which would affect all architectures, we now call > > _install_special_mapping with a sealing flag only within the specific > > architecture that requires it. This targeted approach offers two key > > advantages: 1) It limits the code change's impact to the necessary > > architectures, and 2) It aligns with the software architecture by keeping > > the core memory management within the mm layer, while delegating the > > decision of sealing system mappings to the individual architecture, which > > is particularly relevant since 32-bit architectures never require sealing. > > > > Additionally, this patch introduces a new header, > > include/linux/usrprocess.h, which contains the mseal_system_mappings() > > function. This function helps the architecture determine if system > > mapping is enabled within the current kernel configuration. It can be > > extended in the future to handle opt-in/out prctl for enabling system > > mapping sealing at the process level or a kernel cmdline feature. > > > > A new header file was introduced because it was difficult to find the > > best location for this function. Although include/linux/mm.h was > > considered, this feature is more closely related to user processes > > than core memory management. Additionally, future prctl or kernel > > cmd-line implementations for this feature would not fit within the > > scope of core memory management or mseal.c. This is relevant because > > if we add unit-tests for mseal.c in the future, we would want to limit > > mseal.c's dependencies to core memory management. > > ... > > > > diff --git a/include/linux/userprocess.h b/include/linux/userprocess.h > > new file mode 100644 > > index 000000000000..bd11e2e972c5 > > --- /dev/null > > +++ b/include/linux/userprocess.h > > @@ -0,0 +1,18 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +#ifndef _LINUX_USER_PROCESS_H > > +#define _LINUX_USER_PROCESS_H > > +#include <linux/mm.h> > > Why is this in a new file and not mm.h directly? Anyone who needs to > use this will pull in mm.h anyways, and probably already has mm.h > included. > The commit message includes the reason. I've snipped and left the relevant portion for easy reference, please see the beginning of this email. > > + > > +/* > > + * mseal of userspace process's system mappings. > > + */ > > +static inline unsigned long mseal_system_mappings(void) > > +{ > > +#ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > > + return VM_SEALED; > > +#else > > + return 0; > > +#endif > > +} > > + > > +#endif > > Looking in mm.h, there are examples of config checks which either set > the bit or set it to 0. This means you wouldn't need the function at > all and the precompiler would do all the work (although with a static > inline, I'm not sure it would make a big difference in instructions). > > For instance, you could do: > #ifdef CONFIG_64BIT > /* VM is sealed, in vm_flags */ > #define VM_SEALED _BITUL(63) > > #ifdef CONFIG_MSEAL_SYSTEM_MAPPINGS > #define VM_SYSTEM_SEAL VM_SEALED > else > #define VM_SYSTEM_SEAL VM_NONE > #endif > > #else /* CONFIG_64BIT */ > #define VM_SYSTEM_SEAL VM_NONE > #endif > > Then you can use VM_SYSTEM_SEAL in the system mappings function in the > list of flags instead of having a variable + calling the static > function. > > I didn't look close enough to see if the 32bit version is necessary. > I'm aware of the other pattern. VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of the build. This is intentional. Any 32-bit code trying to use the sealing function or the VM_SEALED flag will immediately fail compilation. This makes it easier to identify incorrect usage. > > diff --git a/init/Kconfig b/init/Kconfig > > index d0d021b3fa3b..892d2bcdf397 100644 > > --- a/init/Kconfig > > +++ b/init/Kconfig > > @@ -1882,6 +1882,24 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS > > config ARCH_HAS_MEMBARRIER_SYNC_CORE > > bool > > > > +config ARCH_HAS_MSEAL_SYSTEM_MAPPINGS > > ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS is more clear. HAS may mean it's > supported or it could mean it's enabled. SUPPORTS is more clear that > this is set if an arch supports the feature. After all, I think HAS > here means it "has support for" mseal system mappings. > > I see that both are used (HAS and SUPPORTS), but I'm still not sure what > HAS means in other contexts without digging. > The existing pattern is to indicate arch support with CONFIG_ARCH_HAS_N and security/KConfig to have CONFIG_N as the main switch. For example, CONFIG_ARCH_HAS_FORTIFY_SOURCE and CONFIG_FORTIFY_SOURCE. Since the MSEAL_SYSTEM_MAPPINGS is placed in security/Kconfig, hence I'm following the existing pattern in security/Kconfig. > > diff --git a/security/Kconfig b/security/Kconfig > > index f10dbf15c294..bfb35fc7a3c6 100644 > > --- a/security/Kconfig > > +++ b/security/Kconfig > > @@ -51,6 +51,24 @@ config PROC_MEM_NO_FORCE > > > > endchoice > > > > +config MSEAL_SYSTEM_MAPPINGS > > + bool "mseal system mappings" > > + depends on 64BIT > > + depends on ARCH_HAS_MSEAL_SYSTEM_MAPPINGS > > + depends on !CHECKPOINT_RESTORE > > + help > > + Seal system mappings such as vdso, vvar, sigpage, uprobes, etc. > > + > > + A 64-bit kernel is required for the memory sealing feature. > > + No specific hardware features from the CPU are needed. > > + > > + Note: CHECKPOINT_RESTORE, UML, gVisor are known to relocate or > > + unmap system mapping, therefore this config can't be enabled > > + universally. > > I guess add rr to the list, since that's also reported to have issues as > well. > sure. Thanks for reviewing. -Jeff