On 13/02/2024 12:19, David Hildenbrand wrote: > On 13.02.24 13:06, Ryan Roberts wrote: >> On 12/02/2024 20:38, Ryan Roberts wrote: >>> [...] >>> >>>>>>> +static inline bool mm_is_user(struct mm_struct *mm) >>>>>>> +{ >>>>>>> + /* >>>>>>> + * Don't attempt to apply the contig bit to kernel mappings, because >>>>>>> + * dynamically adding/removing the contig bit can cause page faults. >>>>>>> + * These racing faults are ok for user space, since they get serialized >>>>>>> + * on the PTL. But kernel mappings can't tolerate faults. >>>>>>> + */ >>>>>>> + return mm != &init_mm; >>>>>>> +} >>>>>> >>>>>> We also have the efi_mm as a non-user mm, though I don't think we manipulate >>>>>> that while it is live, and I'm not sure if that needs any special handling. >>>>> >>>>> Well we never need this function in the hot (order-0 folio) path, so I think I >>>>> could add a check for efi_mm here with performance implication. It's probably >>>>> safest to explicitly exclude it? What do you think? >>>> >>>> Oops: This should have read "I think I could add a check for efi_mm here >>>> *without* performance implication" >>> >>> It turns out that efi_mm is only defined when CONFIG_EFI is enabled. I can do >>> this: >>> >>> return mm != &init_mm && (!IS_ENABLED(CONFIG_EFI) || mm != &efi_mm); >>> >>> Is that acceptable? This is my preference, but nothing else outside of efi >>> references this symbol currently. >>> >>> Or perhaps I can convince myself that its safe to treat efi_mm like userspace. >>> There are a couple of things that need to be garanteed for it to be safe: >>> >>> - The PFNs of present ptes either need to have an associated struct page or >>> need to have the PTE_SPECIAL bit set (either pte_mkspecial() or >>> pte_mkdevmap()) >>> >>> - Live mappings must either be static (no changes that could cause >>> fold/unfold >>> while live) or the system must be able to tolerate a temporary fault >>> >>> Mark suggests efi_mm is not manipulated while live, so that meets the latter >>> requirement, but I'm not sure about the former? >> >> I've gone through all the efi code, and conclude that, as Mark suggests, the >> mappings are indeed static. And additionally, the ptes are populated using only >> the _private_ ptep API, so there is no issue here. As just discussed with Mark, >> my prefereence is to not make any changes to code, and just add a comment >> describing why efi_mm is safe. >> >> Details: >> >> * Registered with ptdump >> * ptep_get_lockless() >> * efi_create_mapping -> create_pgd_mapping … -> init_pte: >> * __ptep_get() >> * __set_pte() >> * efi_memattr_apply_permissions -> efi_set_mapping_permissions … -> >> set_permissions >> * __ptep_get() >> * __set_pte() > > Sound good. We could add some VM_WARN_ON if we ever get the efi_mm via the > "official" APIs. We could, but that would lead to the same linkage issue, which I'm trying to avoid in the first place: VM_WARN_ON(IS_ENABLED(CONFIG_EFI) && mm == efi_mm); This creates new source code dependencies, which I would rather avoid if possible.