On 13/02/2024 13:45, David Hildenbrand wrote: > On 13.02.24 14:33, Ard Biesheuvel wrote: >> On Tue, 13 Feb 2024 at 14:21, Ryan Roberts <ryan.roberts@xxxxxxx> wrote: >>> >>> On 13/02/2024 13:13, David Hildenbrand wrote: >>>> On 13.02.24 14:06, Ryan Roberts wrote: >>>>> On 13/02/2024 12:19, David Hildenbrand wrote: >>>>>> On 13.02.24 13:06, Ryan Roberts wrote: >>>>>>> On 12/02/2024 20:38, Ryan Roberts wrote: >>>>>>>> [...] >>>>>>>> >>>>>>>>>>>> +static inline bool mm_is_user(struct mm_struct *mm) >>>>>>>>>>>> +{ >>>>>>>>>>>> + /* >>>>>>>>>>>> + * Don't attempt to apply the contig bit to kernel mappings, >>>>>>>>>>>> because >>>>>>>>>>>> + * dynamically adding/removing the contig bit can cause page >>>>>>>>>>>> faults. >>>>>>>>>>>> + * These racing faults are ok for user space, since they get >>>>>>>>>>>> serialized >>>>>>>>>>>> + * on the PTL. But kernel mappings can't tolerate faults. >>>>>>>>>>>> + */ >>>>>>>>>>>> + return mm != &init_mm; >>>>>>>>>>>> +} >>>>>>>>>>> >>>>>>>>>>> We also have the efi_mm as a non-user mm, though I don't think we >>>>>>>>>>> manipulate >>>>>>>>>>> that while it is live, and I'm not sure if that needs any special >>>>>>>>>>> handling. >>>>>>>>>> >>>>>>>>>> Well we never need this function in the hot (order-0 folio) path, so I >>>>>>>>>> think I >>>>>>>>>> could add a check for efi_mm here with performance implication. It's >>>>>>>>>> probably >>>>>>>>>> safest to explicitly exclude it? What do you think? >>>>>>>>> >>>>>>>>> Oops: This should have read "I think I could add a check for efi_mm here >>>>>>>>> *without* performance implication" >>>>>>>> >>>>>>>> It turns out that efi_mm is only defined when CONFIG_EFI is enabled I >>>>>>>> can do >>>>>>>> this: >>>>>>>> >>>>>>>> return mm != &init_mm && (!IS_ENABLED(CONFIG_EFI) || mm != &efi_mm); >>>>>>>> >>>>>>>> Is that acceptable? This is my preference, but nothing else outside of efi >>>>>>>> references this symbol currently. >>>>>>>> >>>>>>>> Or perhaps I can convince myself that its safe to treat efi_mm like >>>>>>>> userspace. >>>>>>>> There are a couple of things that need to be garanteed for it to be safe: >>>>>>>> >>>>>>>> - The PFNs of present ptes either need to have an associated struct >>>>>>>> page or >>>>>>>> need to have the PTE_SPECIAL bit set (either pte_mkspecial() or >>>>>>>> pte_mkdevmap()) >>>>>>>> >>>>>>>> - Live mappings must either be static (no changes that could cause >>>>>>>> fold/unfold >>>>>>>> while live) or the system must be able to tolerate a temporary fault >>>>>>>> >>>>>>>> Mark suggests efi_mm is not manipulated while live, so that meets the >>>>>>>> latter >>>>>>>> requirement, but I'm not sure about the former? >>>>>>> >>>>>>> I've gone through all the efi code, and conclude that, as Mark suggests, the >>>>>>> mappings are indeed static. And additionally, the ptes are populated >>>>>>> using only >>>>>>> the _private_ ptep API, so there is no issue here. As just discussed with >>>>>>> Mark, >>>>>>> my prefereence is to not make any changes to code, and just add a comment >>>>>>> describing why efi_mm is safe. >>>>>>> >>>>>>> Details: >>>>>>> >>>>>>> * Registered with ptdump >>>>>>> * ptep_get_lockless() >>>>>>> * efi_create_mapping -> create_pgd_mapping … -> init_pte: >>>>>>> * __ptep_get() >>>>>>> * __set_pte() >>>>>>> * efi_memattr_apply_permissions -> efi_set_mapping_permissions … -> >>>>>>> set_permissions >>>>>>> * __ptep_get() >>>>>>> * __set_pte() >>>>>> >>>>>> Sound good. We could add some VM_WARN_ON if we ever get the efi_mm via the >>>>>> "official" APIs. >>>>> >>>>> We could, but that would lead to the same linkage issue, which I'm trying to >>>>> avoid in the first place: >>>>> >>>>> VM_WARN_ON(IS_ENABLED(CONFIG_EFI) && mm == efi_mm); >>>>> >>>>> This creates new source code dependencies, which I would rather avoid if >>>>> possible. >>>> >>>> Just a thought, you could have a is_efi_mm() function that abstracts all that. >>>> >>>> diff --git a/include/linux/efi.h b/include/linux/efi.h >>>> index c74f47711f0b..152f5fa66a2a 100644 >>>> --- a/include/linux/efi.h >>>> +++ b/include/linux/efi.h >>>> @@ -692,6 +692,15 @@ extern struct efi { >>>> >>>> extern struct mm_struct efi_mm; >>>> >>>> +static inline void is_efi_mm(struct mm_struct *mm) >>>> +{ >>>> +#ifdef CONFIG_EFI >>>> + return mm == &efi_mm; >>>> +#else >>>> + return false; >>>> +#endif >>>> +} >>>> + >>>> static inline int >>>> efi_guidcmp (efi_guid_t left, efi_guid_t right) >>>> { >>>> >>>> >>> >>> That would definitely work, but in that case, I might as well just check for it >>> in mm_is_user() (and personally I would change the name to mm_is_efi()): >>> >>> >>> static inline bool mm_is_user(struct mm_struct *mm) >>> { >>> return mm != &init_mm && !mm_is_efi(mm); >>> } >>> >>> Any objections? >>> >> >> Any reason not to use IS_ENABLED(CONFIG_EFI) in the above? The extern >> declaration is visible to the compiler, and any references should >> disappear before the linker could notice that efi_mm does not exist. >> > > Sure, as long as the linker is happy why not. I'll let Ryan mess with that :) I'm not sure if you are suggesting dropping the mm_is_efi() helper and just use IS_ENABLED(CONFIG_EFI) in mm_is_user() to guard efi_mm, or if you are suggesting using IS_ENABLED(CONFIG_EFI) in mm_is_efi() instead of the ifdefery? The former was what I did initially; It works great, but I didn't like that I was introducing a new code dependecy between efi and arm64 (nothing else outside of efi references efi_mm). So then concluded that it is safe to not worry about efi_mm (thanks for your confirmation). But then David wanted a VM_WARN check, which reintroduces the code dependency. So he suggested the mm_is_efi() helper to hide that... This is all starting to feel circular... Since I've jsut updated the code to do it David's way, I propose leaving it as is since nobody has strong feelings. > >> In any case, feel free to add >> >> Acked-by: Ard Biesheuvel <ardb@xxxxxxxxxx> Great thanks! > > Thanks for the review. >