* Jeff Xu <jeffxu@xxxxxxxxxxxx> [241111 14:10]: > Hi Liam > > On Thu, Oct 17, 2024 at 9:01 AM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote: > > > > > Does it make sense for this to live in exec? Couldn't you put it in the > > > > mm/mseal.c file? It's vma flags for mappings and you've put it in > > > > fs/exec? > > > > > > > If you are referring to utilities related to kernel cmdline, they > > > should be in this file. > > > > You created a wrapper for the command line, but then included the user > > in this file as well. > > > > hugetlbfs reads the command line as well, in cmdline_parse_hugetlb_cma. > > That parser lives with the rest of the hugetlb code in hugetlb.c > > > > I think this has to do with your view as this is an exec thing, where I > > think it's an mm thing. My arguments are that you are directly adding > > flags to vmas and it's dealing with mseal which has memory in the name > > with the file living in the mm/ directory. If I wanted to know what's > > using mseal, I'd start there and totally miss what you are adding here. > > > > Besides applying a vma flag to exec mappings, why do you feel like it > > belongs in fs/ ? > > > The vdso/vvar/stack/heap alike are type of mappings belonging to > processes, and are created during execve() syscall which is in > fs/exec.c. > > mm/mseal.c provides core memory sealing functionality and exec.c uses > it. IMO, it is better to keep the provider (mm/mseal.c) and consumer > (executable) separate. IMO, mseal should not be sitting in its own file since it's just a vma flag. The entire feature is to do with vma and vma flag checking. It should at least be globbed together in one file, as much as possible. The way this and the help text is written means there is only the VM_SEALED flag that links this to the actual feature name. And with all the abstractions, that flag is 2-3 functions deep across 2-3 files. In fact, there isn't a dependency for enabling mseal in the kconfig? Isn't that needed? Please fix this. > > To make modulization better, I can do below adjustment: > if (seal_system_mapping_enabled()) <-- implemented by fs/exec.c > add_vm_sealed() <- keep in include/linux/mm.h > > However, if you have a strong opinion on this, I could move the > parsing logic to mm/mseal. Yes, please move the parsing logic together with the other mseal code. > > > > > > +void update_seal_exec_system_mappings(unsigned long * > > > > The name is also very long and a bit odd, it could be used for other > > > > reasons, but you have _system_mappings on the end, and you use seal but > > > > it's mseal (or vm_seal)? Would mseal_flag() work? > > > > > > > It could be longer :-) > > > it means update_sealing_flag_for_executable_system_mappings. > > > mseal_flag is too short and not descriptive. > > > > mseal_exec_flags() ? > > > It needs to be more descriptive because there are also stacks and > heaps to be sealed. Stopping vma modifications to vmas that exist specifically to be dynamic does not sound wise. > I suggest to use below name to make it shorter: > > if (seal_system_mapping_enabled()) > add_vm_sealed() I am not sure what one you are renaming or what these functions would do. I think you need to look at your overall design and try to fit it into what exists instead of putting a call in a wrapper function (_install_special_mapping) to alter an argument. > > > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > > > > index 57fd5ab2abe7..d4717e34a60d 100644 > > > > > --- a/mm/mmap.c > > > > > +++ b/mm/mmap.c > > > > > @@ -2133,6 +2133,7 @@ struct vm_area_struct *_install_special_mapping( > > > > > unsigned long addr, unsigned long len, > > > > > unsigned long vm_flags, const struct vm_special_mapping *spec) > > > > > { > > > > > + update_seal_exec_system_mappings(&vm_flags); > > > > > return __install_special_mapping(mm, addr, len, vm_flags, (void *)spec, > > > > > &special_mapping_vmops); > > > > > > > > If you were to return a flag, you could change the vm_flags argument to > > > > vm_flags | mseal_flag() > > > > > > > passing pointer seems to be the most efficient way. > > > > I disagree. Here is the godbolt.org output for gcc x86-64 14.2 of your > > code (with some added #defines to make it compile) > > > > seal_system_mappings: > > .long 1 > > seal_system_mappings_enabled: > > push rbp > > mov rbp, rsp > > mov eax, DWORD PTR seal_system_mappings[rip] > > cmp eax, 1 > > jne .L2 > > mov eax, 1 > > jmp .L3 > > .L2: > > mov eax, 0 > > .L3: > > pop rbp > > ret > > update_seal_exec_system_mappings: > > push rbp > > mov rbp, rsp > > sub rsp, 8 > > mov QWORD PTR [rbp-8], rdi > > mov rax, QWORD PTR [rbp-8] > > mov rax, QWORD PTR [rax] > > and eax, 2 > > test rax, rax > > jne .L6 > > call seal_system_mappings_enabled > > test al, al > > je .L6 > > mov rax, QWORD PTR [rbp-8] > > mov rax, QWORD PTR [rax] > > or rax, 2 > > mov rdx, rax > > mov rax, QWORD PTR [rbp-8] > > mov QWORD PTR [rax], rdx > > .L6: > > nop > > leave > > ret > > main: > > push rbp > > mov rbp, rsp > > sub rsp, 16 > > mov QWORD PTR [rbp-8], 0 > > lea rax, [rbp-8] > > mov rdi, rax > > call update_seal_exec_system_mappings > > mov rax, QWORD PTR [rbp-8] > > leave > > ret > > > > ----- 48 lines ----- > > Here is what I am suggesting to do with replacing the passing of a > > pointer with a concise "vm_flags | mseal_exec_flags()" (with the same > > added #defines to make it compile) > > > > seal_system_mappings: > > .long 1 > > mseal_exec_flags: > > push rbp > > mov rbp, rsp > > mov eax, DWORD PTR seal_system_mappings[rip] > > cmp eax, 1 > > jne .L2 > > mov eax, 2 > > jmp .L3 > > .L2: > > mov eax, 0 > > .L3: > > pop rbp > > ret > > main: > > push rbp > > mov rbp, rsp > > sub rsp, 16 > > mov QWORD PTR [rbp-8], 0 > > call mseal_exec_flags > > mov edx, eax > > mov rax, QWORD PTR [rbp-8] > > or eax, edx > > leave > > ret > > > > ----- 26 lines ----- > > > > So as you can see, there are less instructions in my version; there are > > 47.92% less lines of assembly. > > > vm_flags already run out of space in 32 bit, sooner or later we will > need to change that to *** a struct ***, passing address will be > becoming necessary with struct. Since this is not a performance > sensitive code path, 3 or 4 times during execve(), I think it would be > good to start here. No. I have pointed out why this is not 'the most efficient way' and you are now switching your argument to 'this will be needed in the future'. Please fix this. Liam