On Mon, Nov 25, 2024 at 12:49 PM <jeffxu@xxxxxxxxxxxx> wrote:
>
> From: Jeff Xu <jeffxu@xxxxxxxxxxxx>
>
> Seal vdso, vvar, sigpage, uprobes and vsyscall.
>
> Those mappings are readonly or executable only, sealing can protect
> them from ever changing or unmapped during the life time of the process.
> For complete descriptions of memory sealing, please see mseal.rst [1].
>
> System mappings such as vdso, vvar, and sigpage (for arm) are
> generated by the kernel during program initialization, and are
> sealed after creation.
>
> Unlike the aforementioned mappings, the uprobe mapping is not
> established during program startup. However, its lifetime is the same
> as the process's lifetime [2]. It is sealed from creation.
>
> The vdso, vvar, sigpage, and uprobe mappings all invoke the
> _install_special_mapping() function. As no other mappings utilize this
> function, it is logical to incorporate sealing logic within
> _install_special_mapping(). This approach avoids the necessity of
> modifying code across various architecture-specific implementations.
>
> The vsyscall mapping, which has its own initialization function, is
> sealed in the XONLY case, it seems to be the most common and secure
> case of using vsyscall.
>
> It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
> alter the mapping of vdso, vvar, and sigpage during restore
> operations. Consequently, this feature cannot be universally enabled
> across all systems.
>
...
>
> +config SEAL_SYSTEM_MAPPINGS
> +	bool "seal system mappings"
> +	default n
> +	depends on 64BIT
> +	depends on ARCH_HAS_SEAL_SYSTEM_MAPPINGS
> +	depends on !CHECKPOINT_RESTORE

Hi Jeff,

I like the idea of this patchset, but I don’t like the idea of forcing
users to choose between this security feature and checkpoint/restore
functionality. We need to explore ways to make this feature work with
checkpoint/restore. Relying on CAP_CHECKPOINT_RESTORE is the obvious
approach.
CRIU just needs to move these mappings, and it doesn't need to change
their properties or modify their contents.

With that in mind, here are two options:

* Allow moving sealed mappings for processes with CAP_CHECKPOINT_RESTORE.
* Allow temporarily "unsealing" mappings for processes with
  CAP_CHECKPOINT_RESTORE. CRIU could unseal mappings, move them, and
  then seal them back.

Another approach might be to make this feature configurable on a
per-process basis (e.g., via prctl). Once enabled for a process, it would
be inherited by all its children. It can't be disabled unless a process
has CAP_CHECKPOINT_RESTORE.

I've added Mike, Dima, and Alex to the thread. They might have other ideas.

Thanks,
Andrei
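
For illustration only, the policy described above (move-only access for
CAP_CHECKPOINT_RESTORE, plus a per-process setting that children inherit)
could be modeled in plain userspace C roughly as follows. This is a sketch
under stated assumptions, not kernel code: struct task_model,
sysmap_op_allowed(), set_seal_sysmap() and fork_model() are hypothetical
names that appear neither in the patch nor in the kernel, and no real
prctl option is implied.

```c
#include <stdbool.h>

/* Hypothetical model: none of these names exist in the kernel or the patch. */
enum map_op { OP_MOVE, OP_UNMAP, OP_PROTECT };

struct task_model {
	bool seal_sysmap;            /* assumed per-process opt-in (the prctl idea) */
	bool cap_checkpoint_restore; /* stands in for holding CAP_CHECKPOINT_RESTORE */
};

/*
 * First option from the mail: once sealing is on, the only permitted
 * operation on a system mapping is a move, and only for a task that
 * holds CAP_CHECKPOINT_RESTORE.
 */
static bool sysmap_op_allowed(const struct task_model *t, enum map_op op)
{
	if (!t->seal_sysmap)
		return true;
	return op == OP_MOVE && t->cap_checkpoint_restore;
}

/*
 * Per-process idea: enabling is always allowed; disabling an enabled
 * setting is refused without the capability.
 * Returns 0 on success, -1 on refusal.
 */
static int set_seal_sysmap(struct task_model *t, bool enable)
{
	if (!enable && t->seal_sysmap && !t->cap_checkpoint_restore)
		return -1;
	t->seal_sysmap = enable;
	return 0;
}

/* A child starts with a copy of the parent's setting (and capability state). */
static struct task_model fork_model(const struct task_model *parent)
{
	return *parent;
}
```

In this model an unprivileged task that has enabled sealing can neither
move, unmap, nor re-protect the system mappings, and cannot turn the
setting off again, while a restorer holding the capability can still
relocate them.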