On Wed, Sep 27, 2023 at 3:49 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
>
> On Wed, Sep 27, 2023 at 11:08 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > On Wed, Sep 27, 2023 at 1:42 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Sep 27, 2023 at 1:04 PM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Sep 27, 2023 at 8:08 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > > > On Wed, Sep 27, 2023 at 5:47 AM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > > > > > On Sat, Sep 23, 2023 at 3:31 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > > > > > > +       dst_pmdval = pmdp_get_lockless(dst_pmd);
> > > > > > > +       /*
> > > > > > > +        * If the dst_pmd is mapped as THP don't override it and just
> > > > > > > +        * be strict. If dst_pmd changes into TPH after this check, the
> > > > > > > +        * remap_pages_huge_pmd() will detect the change and retry
> > > > > > > +        * while remap_pages_pte() will detect the change and fail.
> > > > > > > +        */
> > > > > > > +       if (unlikely(pmd_trans_huge(dst_pmdval))) {
> > > > > > > +               err = -EEXIST;
> > > > > > > +               break;
> > > > > > > +       }
> > > > > > > +
> > > > > > > +       ptl = pmd_trans_huge_lock(src_pmd, src_vma);
> > > > > > > +       if (ptl && !pmd_trans_huge(*src_pmd)) {
> > > > > > > +               spin_unlock(ptl);
> > > > > > > +               ptl = NULL;
> > > > > > > +       }
> > > > > >
> > > > > > This still looks wrong - we do still have to split_huge_pmd()
> > > > > > somewhere so that remap_pages_pte() works.
> > > > >
> > > > > Hmm, I guess this extra check is not even needed...
> > > >
> > > > Hm, and instead we'd bail at the pte_offset_map_nolock() in
> > > > remap_pages_pte()? I guess that's unusual but works...
> > >
> > > Yes, that's what I was thinking but I agree, that seems fragile. Maybe
> > > just bail out early if (ptl && !pmd_trans_huge())?
> >
> > No, actually we can still handle is_swap_pmd() case by splitting it
> > and remapping the individual ptes. So, I can bail out only in case of
> > pmd_devmap().
>
> FWIW I only learned today that "real" swap PMDs don't actually exist -
> only migration entries, which are encoded as swap PMDs, exist. You can
> see that when you look through the cases that something like
> __split_huge_pmd() or zap_pmd_range() actually handles.

Ah, good point.

>
> So I think if you wanted to handle all the PMD types properly here
> without splitting, you could do that without _too_ much extra code.
> But idk if it's worth it.

Yeah, I guess I can call pmd_migration_entry_wait() and retry by
returning EAGAIN, similar to how remap_pages_pte() handles PTE
migration. Looks simple enough.

Thanks for all the pointers! I'll start cooking the next version.
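
For reference, the source-PMD handling discussed in this thread could be
modeled roughly as below. This is a standalone userspace sketch in C, not
kernel code and not the next version of the patch: the enums, the struct
and decide_src_pmd() are hypothetical names made up for illustration, and
the -EINVAL used for the devmap case is a placeholder, since the thread
only says "bail out" without naming an error code.

```c
#include <errno.h>

/*
 * Userspace model only. These enums and decide_src_pmd() are hypothetical
 * names for illustration; they are not kernel APIs.
 */
enum pmd_kind {
	PMD_KIND_TABLE,     /* PMD points to a PTE page table */
	PMD_KIND_THP,       /* present transparent huge page */
	PMD_KIND_MIGRATION, /* migration entry, encoded as a swap PMD */
	PMD_KIND_DEVMAP,    /* device-mapped huge PMD */
};

enum remap_action {
	ACT_REMAP_HUGE, /* move the huge PMD as a unit */
	ACT_REMAP_PTES, /* fall through to the per-PTE path */
	ACT_WAIT_RETRY, /* wait for migration to finish, then retry */
	ACT_BAIL,       /* give up on this range */
};

struct remap_decision {
	enum remap_action action;
	int err; /* 0, or a negative errno handed back to the caller */
};

struct remap_decision decide_src_pmd(enum pmd_kind src, int dst_is_thp)
{
	struct remap_decision d = { ACT_BAIL, 0 };

	/* Be strict: never override a destination already mapped as THP. */
	if (dst_is_thp) {
		d.err = -EEXIST;
		return d;
	}

	switch (src) {
	case PMD_KIND_THP:
		d.action = ACT_REMAP_HUGE;
		break;
	case PMD_KIND_MIGRATION:
		/* Models pmd_migration_entry_wait() followed by a retry. */
		d.action = ACT_WAIT_RETRY;
		d.err = -EAGAIN;
		break;
	case PMD_KIND_DEVMAP:
		/* The thread only says "bail out"; -EINVAL is a placeholder. */
		d.err = -EINVAL;
		break;
	case PMD_KIND_TABLE:
		d.action = ACT_REMAP_PTES;
		break;
	}
	return d;
}
```

In this model a caller would loop while the decision carries -EAGAIN,
which mirrors the retry behaviour described for PTE migration entries in
remap_pages_pte().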