Hi Lorenzo On Wed, Nov 13, 2024 at 12:36 PM Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> wrote: > > You left kernel test bots and review unanswered on v2, which makes it > difficult to know whether you actually addressed everything. > Thanks for reminding me, I got distracted previously. I responded to the test bots. > Please respond to all outstanding comments in the v2 thread so we know, > thanks, even if it is to say 'this is no longer an issue as v3 addresses'. > All comments of v2 were addressed in V3. > On Wed, Nov 13, 2024 at 07:16:01PM +0000, jeffxu@xxxxxxxxxxxx wrote: > > From: Jeff Xu <jeffxu@xxxxxxxxxxxx> > > > > Seal vdso, vvar, sigpage, uprobes and vsyscall. > > > > Those mappings are readonly or executable only, sealing can protect > > them from ever changing or unmapped during the life time of the process. > > For complete descriptions of memory sealing, please see mseal.rst [1]. > > > > System mappings such as vdso, vvar, and sigpage (for arm) are > > generated by the kernel during program initialization, and are > > sealed after creation. > > > > Unlike the aforementioned mappings, the uprobe mapping is not > > established during program startup. However, its lifetime is the same > > as the process's lifetime [1]. It is sealed from creation. > > > > The vdso, vvar, sigpage, and uprobe mappings all invoke the > > _install_special_mapping() function. As no other mappings utilize this > > function, it is logical to incorporate sealing logic within > > _install_special_mapping(). This approach avoids the necessity of > > modifying code across various architecture-specific implementations. > > Some arches unmap VDSO's which mseal prevents, so are those broken now? Did > you test this? > Do you happen to know which arch might unmap vdso ? The information I collected so far is only CHECKPOINT_RESTORE would remap/unmap vdso. And if CHECKPOINT_RESTORE is enabled, Kconfig will prevent this from being enabled. > > > > The vsyscall mapping, which has its own initialization function, is > > sealed in the XONLY case, it seems to be the most common and secure > > case of using vsyscall. > > > > It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may > > alter the mapping of vdso, vvar, and sigpage during restore > > operations. Consequently, this feature cannot be universally enabled > > across all systems. To address this, a kernel configuration option has > > been introduced to enable or disable this functionality. > > > > [1] Documentation/userspace-api/mseal.rst > > [2] https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@xxxxxxxxxxxxxx/ > > I don't see any mention to testing, and this affects multiple different > architectures. > > Please describe what testing you performed and on what architectures. > The tests are done in ChromeOS and Android on ARM and INTEL. > I suggest we allow this only for architectures you have explicitly tested, > especially as this is 'hidden' from arch maintainers who might find this > change surprising. > I thought the current approach aligns with Linus's suggestion of unifying vdso/vvar code [1]. I honestly think this is not architecture dependent, i.e. this doesn't require any specific CPU feature. I could add ARCH_HAS_SEAL_SYSTEM_MAPPINGS in KCONFIG and enable this for x86_64 and arm64 for now, this would allow other architecture maintainers to have opportunities to test this . [1] https://lore.kernel.org/all/CAHk-=wgTXVMBRuya5J0peujSrtunehRtzk=WVrm6njPhHrpTJw@xxxxxxxxxxxxxx/ Thanks for reviewing. -Jeff