On 13/02/2024 13:22, David Hildenbrand wrote: > On 13.02.24 14:20, Ryan Roberts wrote: >> On 13/02/2024 13:13, David Hildenbrand wrote: >>> On 13.02.24 14:06, Ryan Roberts wrote: >>>> On 13/02/2024 12:19, David Hildenbrand wrote: >>>>> On 13.02.24 13:06, Ryan Roberts wrote: >>>>>> On 12/02/2024 20:38, Ryan Roberts wrote: >>>>>>> [...] >>>>>>> >>>>>>>>>>> +static inline bool mm_is_user(struct mm_struct *mm) >>>>>>>>>>> +{ >>>>>>>>>>> + /* >>>>>>>>>>> + * Don't attempt to apply the contig bit to kernel mappings, >>>>>>>>>>> because >>>>>>>>>>> + * dynamically adding/removing the contig bit can cause page >>>>>>>>>>> faults. >>>>>>>>>>> + * These racing faults are ok for user space, since they get >>>>>>>>>>> serialized >>>>>>>>>>> + * on the PTL. But kernel mappings can't tolerate faults. >>>>>>>>>>> + */ >>>>>>>>>>> + return mm != &init_mm; >>>>>>>>>>> +} >>>>>>>>>> >>>>>>>>>> We also have the efi_mm as a non-user mm, though I don't think we >>>>>>>>>> manipulate >>>>>>>>>> that while it is live, and I'm not sure if that needs any special >>>>>>>>>> handling. >>>>>>>>> >>>>>>>>> Well we never need this function in the hot (order-0 folio) path, so I >>>>>>>>> think I >>>>>>>>> could add a check for efi_mm here with performance implication. It's >>>>>>>>> probably >>>>>>>>> safest to explicitly exclude it? What do you think? >>>>>>>> >>>>>>>> Oops: This should have read "I think I could add a check for efi_mm here >>>>>>>> *without* performance implication" >>>>>>> >>>>>>> It turns out that efi_mm is only defined when CONFIG_EFI is enabled. I >>>>>>> can do >>>>>>> this: >>>>>>> >>>>>>> return mm != &init_mm && (!IS_ENABLED(CONFIG_EFI) || mm != &efi_mm); >>>>>>> >>>>>>> Is that acceptable? This is my preference, but nothing else outside of efi >>>>>>> references this symbol currently. >>>>>>> >>>>>>> Or perhaps I can convince myself that its safe to treat efi_mm like >>>>>>> userspace. >>>>>>> There are a couple of things that need to be garanteed for it to be safe: >>>>>>> >>>>>>> - The PFNs of present ptes either need to have an associated struct >>>>>>> page or >>>>>>> need to have the PTE_SPECIAL bit set (either pte_mkspecial() or >>>>>>> pte_mkdevmap()) >>>>>>> >>>>>>> - Live mappings must either be static (no changes that could cause >>>>>>> fold/unfold >>>>>>> while live) or the system must be able to tolerate a temporary fault >>>>>>> >>>>>>> Mark suggests efi_mm is not manipulated while live, so that meets the latter >>>>>>> requirement, but I'm not sure about the former? >>>>>> >>>>>> I've gone through all the efi code, and conclude that, as Mark suggests, the >>>>>> mappings are indeed static. And additionally, the ptes are populated using >>>>>> only >>>>>> the _private_ ptep API, so there is no issue here. As just discussed with >>>>>> Mark, >>>>>> my prefereence is to not make any changes to code, and just add a comment >>>>>> describing why efi_mm is safe. >>>>>> >>>>>> Details: >>>>>> >>>>>> * Registered with ptdump >>>>>> * ptep_get_lockless() >>>>>> * efi_create_mapping -> create_pgd_mapping … -> init_pte: >>>>>> * __ptep_get() >>>>>> * __set_pte() >>>>>> * efi_memattr_apply_permissions -> efi_set_mapping_permissions … -> >>>>>> set_permissions >>>>>> * __ptep_get() >>>>>> * __set_pte() >>>>> >>>>> Sound good. We could add some VM_WARN_ON if we ever get the efi_mm via the >>>>> "official" APIs. >>>> >>>> We could, but that would lead to the same linkage issue, which I'm trying to >>>> avoid in the first place: >>>> >>>> VM_WARN_ON(IS_ENABLED(CONFIG_EFI) && mm == efi_mm); >>>> >>>> This creates new source code dependencies, which I would rather avoid if >>>> possible. >>> >>> Just a thought, you could have a is_efi_mm() function that abstracts all that. >>> >>> diff --git a/include/linux/efi.h b/include/linux/efi.h >>> index c74f47711f0b..152f5fa66a2a 100644 >>> --- a/include/linux/efi.h >>> +++ b/include/linux/efi.h >>> @@ -692,6 +692,15 @@ extern struct efi { >>> extern struct mm_struct efi_mm; >>> +static inline void is_efi_mm(struct mm_struct *mm) >>> +{ >>> +#ifdef CONFIG_EFI >>> + return mm == &efi_mm; >>> +#else >>> + return false; >>> +#endif >>> +} >>> + >>> static inline int >>> efi_guidcmp (efi_guid_t left, efi_guid_t right) >>> { >>> >>> >> >> That would definitely work, but in that case, I might as well just check for it >> in mm_is_user() (and personally I would change the name to mm_is_efi()): >> >> >> static inline bool mm_is_user(struct mm_struct *mm) >> { >> return mm != &init_mm && !mm_is_efi(mm); >> } >> >> Any objections? >> > > Nope :) Maybe slap in an "unlikely()", because efi_mm *is* unlikely to show up. Deal >