> On 12 Dec 2019, at 20:47, Liran Alon <liran.alon@xxxxxxxxxx> wrote: > > > >> On 12 Dec 2019, at 20:22, Barret Rhoden <brho@xxxxxxxxxx> wrote: >> >> This change allows KVM to map DAX-backed files made of huge pages with >> huge mappings in the EPT/TDP. > > This change isn’t only relevant for TDP. It also affects when KVM use shadow-paging. > See how FNAME(page_fault)() calls transparent_hugepage_adjust(). > >> >> DAX pages are not PageTransCompound. The existing check is trying to >> determine if the mapping for the pfn is a huge mapping or not. > > I would rephrase “The existing check is trying to determine if the pfn > is mapped as part of a transparent huge-page”. > >> For >> non-DAX maps, e.g. hugetlbfs, that means checking PageTransCompound. > > This is not related to hugetlbfs but rather THP. > >> For DAX, we can check the page table itself. >> >> Note that KVM already faulted in the page (or huge page) in the host's >> page table, and we hold the KVM mmu spinlock. We grabbed that lock in >> kvm_mmu_notifier_invalidate_range_end, before checking the mmu seq. >> >> Signed-off-by: Barret Rhoden <brho@xxxxxxxxxx> > > I don’t think the right place to change for this functionality is transparent_hugepage_adjust() > which is meant to handle PFNs that are mapped as part of a transparent huge-page. > > For example, this would prevent mapping DAX-backed file page as 1GB. > As transparent_hugepage_adjust() only handles the case (level == PT_PAGE_TABLE_LEVEL). > > As you are parsing the page-tables to discover the page-size the PFN is mapped in, > I think you should instead modify kvm_host_page_size() to parse page-tables instead > of rely on vma_kernel_pagesize() (Which relies on vma->vm_ops->pagesize()) in case > of is_zone_device_page(). > The main complication though of doing this is that at this point you don’t yet have the PFN > that is retrieved by try_async_pf(). So maybe you should consider modifying the order of calls > in tdp_page_fault() & FNAME(page_fault)(). > > -Liran Or alternatively when thinking about it more, maybe just rename transparent_hugepage_adjust() to not be specific to THP and better handle the case of parsing page-tables changing mapping-level to 1GB. That is probably easier and more elegant. -Liran