* Jeff Xu <jeffxu@xxxxxxxxxxxx> [250213 17:00]: > On Thu, Feb 13, 2025 at 12:54 PM Liam R. Howlett > <Liam.Howlett@xxxxxxxxxx> wrote: > > > > > > > > > > > VM_SEALED isn't defined in 32-bit systems, and mseal.c isn't part of > > > > > the build. This is intentional. Any 32-bit code trying to use the > > > > > sealing function or the VM_SEALED flag will immediately fail > > > > > compilation. This makes it easier to identify incorrect usage. > > > > > > > > So you are against using the #define because the VM_SYSTEM_SEAL will be > > > > defined, even though it will be VM_NONE? This is no worse than a > > > > function that returns 0, and it aligns better with what we have today in > > > > that it can be put in the list of other flags. > > > > > > When I was reading through all of this and considering the history of > > > its development goals, it strikes me that a function is easier for the > > > future if/when this can be made a boot-time setting. > > > > > > > Reworking this change to function as a boot-time parameter, or whatever, > > would not be a significant amount of work, if/when the time comes. > > Since we don't know what the future holds, it it unnecessary to engineer > > in a potential change for a future version when the change is easy > > enough to make later and keep the code cleaner now. > > > Sure, I will put the function in mm.h for this patch. We can find a > proper place when it is time to implement the boot-time parameter > change. > > The call stack for sealing system mapping is something like below: > > install_special_mapping (mm/mmap.c) > map_vdso (arch/x86/entry/vdso/vma.c) > load_elf_binary (fs/binfmt_elf.c) > load_misc_binary (fs/binfmt_misc.c) > bprm_execve (fs/exec.c) > do_execveat_common > __x64_sys_execve > do_syscall_64 > > IMO, there's a clear divide between the API implementer and the API user. > mm and mm.h are the providers, offering the core mm functionality > through APIs/data structures like install_special_mapping(). > > The exe layer (bprm_execve, map_vdso, etc) is the consumer of the > install_special_mapping. > The logic related to checking if sealing system mapping is enabled > belongs to the exe layer. Since this is an all or nothing enabling, there is no reason to have each caller check the same thing and do the same action. You should put the logic into the provider - they all end up doing the same thing. Also, this is a compile time option so it doesn't even need to be checked on execution - just apply it in the first place, at the source. Your static inline function was already doing this...? I'm confused as to what you are arguing here because it goes against what you had and what I suggested. The alternative you are suggesting is more code, more instructions, and the best outcome is the same result. > > > > > > If mm maintainers prefer a #define for now, that's fine of course. The > > > value of VM_SYSTEM_SEAL can be VM_NONE on 32-bit. > > > > Thanks. I think having a flag with VM_NONE on 32-bit is just as sane as > > a "flags |= system_seal()" call that unconditionally returns 0 on > > 32-bit. > > > Consider the case below in src/third_party/kernel/v6.6/fs/proc/task_mmu.c, > > #ifdef CONFIG_64BIT > [ilog2(VM_SEALED)] = "sl", > #endif > > If #ifdef CONFIG_64BIT is missing, it won't be detected during compile time. > > Setting VM_SEALED to VM_NONE could simplify the coding in some cases > (get/set case), but it might make other cases error prone. > > I would prefer to not have VM_SEALED for 32 bit. But what I posted leaves VM_SEALED undefined for 32 bit, it defines VM_SYSTEM_SEALED which can be placed, for instance, into _install_special_mapping() arguments directly. Reducing the change to just adding a new flag to the list. And it's applied to all system mappings based on if it's enabled on compile or not. Also: include/linux/mm.h:#define VM_NONE 0x00000000 so, I'm not sure what evaluation you are concerned about? It would be defined as 0. Thanks, Liam