On Wed, May 22, 2024 at 11:50:06AM -0600, Alex Williamson wrote:
> I'm not sure if there are any outstanding blockers on Peter's side, but
> this seems like a good route from the vfio side.  If we're seeing this
> now without lockdep, we might need to bite the bullet and take the hit
> with vmf_insert_pfn() while the pmd/pud path learn about pfnmaps.

No immediate blockers, it's just that there are some small details that I
may still need to look into.  The current one TBD is the implications of
pfn tracking on PAT.  Here I see at least two issues to be investigated.

Firstly, when vfio zaps the BARs it can try to remove the VM_PAT flag.  To
be explicit, unmap_single_vma() has:

	if (unlikely(vma->vm_flags & VM_PFNMAP))
		untrack_pfn(vma, 0, 0, mm_wr_locked);

I believe it'll also erase the entry in memtype_rbroot.  I'm not sure
whether that's correct at all, and if it is correct, how we should
re-inject it.  So far I feel like we should leave the pfn tracking alone
when we're only tearing down pgtables, but I'll need to double check.
E.g. I at least checked that MADV_DONTNEED can't be applied to PFNMAPs, so
vfio zapping the vma should be the first thing that can do that besides
munmap().

The other thing is I just noticed very recently that the PAT bit on x86_64
is not always the same one.  On 4K it's bit 7, but that bit is reused as
PSE on higher levels, moving PAT to bit 12:

#define _PAGE_BIT_PSE		7	/* 4 MB (or 2MB) page */
#define _PAGE_BIT_PAT		7	/* on 4KB pages */
#define _PAGE_BIT_PAT_LARGE	12	/* On 2MB or 1GB pages */

We may need something like protval_4k_2_large() when injecting huge
mappings.

From the schedule POV, the plan is that I'll continue work on this after I
flush the inbox for the past two weeks and when I get some spare time.
Now ~160 emails left, but I'm getting there.  If there are comments on
either of the above, please shoot.

Thanks,

-- 
Peter Xu
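
P.S. To make the PAT bit point concrete, here is a rough userspace sketch of
the kind of bit shuffling that something like protval_4k_2_large() would
need to do.  It only mirrors the two bit positions quoted above; the names
PAGE_BIT_PAT, PAGE_BIT_PAT_LARGE, prot_4k_to_large() and prot_large_to_4k()
are made up for illustration and are not the kernel's definitions, and this
is not a claim about how the final patch should look.

```c
#include <stdint.h>

/* Bit positions copied from the defines quoted above (names are illustrative). */
#define PAGE_BIT_PAT        7   /* PAT on 4KB pages; same bit is PSE on huge entries */
#define PAGE_BIT_PAT_LARGE  12  /* PAT on 2MB or 1GB pages */

#define PAGE_PAT        (UINT64_C(1) << PAGE_BIT_PAT)
#define PAGE_PAT_LARGE  (UINT64_C(1) << PAGE_BIT_PAT_LARGE)

/*
 * Hypothetical helper: convert 4K-style protection bits to the huge-page
 * layout by clearing both PAT positions, then moving the 4K PAT bit
 * (bit 7) up to the large-page PAT position (bit 12).
 */
static inline uint64_t prot_4k_to_large(uint64_t val)
{
	return (val & ~(PAGE_PAT | PAGE_PAT_LARGE)) |
	       ((val & PAGE_PAT) << (PAGE_BIT_PAT_LARGE - PAGE_BIT_PAT));
}

/* Hypothetical inverse: move the large-page PAT bit back down to bit 7. */
static inline uint64_t prot_large_to_4k(uint64_t val)
{
	return (val & ~(PAGE_PAT | PAGE_PAT_LARGE)) |
	       ((val & PAGE_PAT_LARGE) >> (PAGE_BIT_PAT_LARGE - PAGE_BIT_PAT));
}
```

With these, a 4K value with PAT set such as 0x83 becomes 0x1003 in the
large layout.  Bit 7 comes out clear, so the caller would still have to
set PSE separately when building the actual huge entry.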