On Thu, Oct 24, 2019 at 05:19:35AM +0800, Yang Shi wrote:
> We have usecase to use tmpfs as QEMU memory backend and we would like to
> take the advantage of THP as well. But, our test shows the EPT is not
> PMD mapped even though the underlying THP are PMD mapped on host.
> The number showed by /sys/kernel/debug/kvm/largepage is much less than
> the number of PMD mapped shmem pages as the below:
>
> 7f2778200000-7f2878200000 rw-s 00000000 00:14 262232 /dev/shm/qemu_back_mem.mem.Hz2hSf (deleted)
> Size: 4194304 kB
> [snip]
> AnonHugePages: 0 kB
> ShmemPmdMapped: 579584 kB
> [snip]
> Locked: 0 kB
>
> cat /sys/kernel/debug/kvm/largepages
> 12
>
> And some benchmarks do worse than with anonymous THPs.
>
> By digging into the code we figured out that commit 127393fbe597 ("mm:
> thp: kvm: fix memory corruption in KVM with THP enabled") checks if
> there is a single PTE mapping on the page for anonymous THP when
> setting up EPT map. But, the _mapcount < 0 check doesn't fit to page
> cache THP since every subpage of page cache THP would get _mapcount
> inc'ed once it is PMD mapped, so PageTransCompoundMap() always returns
> false for page cache THP. This would prevent KVM from setting up PMD
> mapped EPT entry.
>
> So we need handle page cache THP correctly. However, when page cache
> THP's PMD gets split, kernel just remove the map instead of setting up
> PTE map like what anonymous THP does. Before KVM calls get_user_pages()
> the subpages may get PTE mapped even though it is still a THP since the
> page cache THP may be mapped by other processes at the mean time.
>
> Checking its _mapcount and whether the THP has PTE mapped or not.
> Although this may report some false negative cases (PTE mapped by other
> processes), it looks not trivial to make this accurate.
>
> With this fix /sys/kernel/debug/kvm/largepage would show reasonable
> pages are PMD mapped by EPT as the below:
>
> 7fbeaee00000-7fbfaee00000 rw-s 00000000 00:14 275464 /dev/shm/qemu_back_mem.mem.SKUvat (deleted)
> Size: 4194304 kB
> [snip]
> AnonHugePages: 0 kB
> ShmemPmdMapped: 557056 kB
> [snip]
> Locked: 0 kB
>
> cat /sys/kernel/debug/kvm/largepages
> 271
>
> And the benchmarks are as same as anonymous THPs.
>
> Fixes: dd78fedde4b9 ("rmap: support file thp")
> Signed-off-by: Yang Shi <yang.shi@xxxxxxxxxxxxxxxxx>
> Reported-by: Gang Deng <gavin.dg@xxxxxxxxxxxxxxxxx>
> Tested-by: Gang Deng <gavin.dg@xxxxxxxxxxxxxxxxx>
> Suggested-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> 4.8+

Looks good to me.

Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

--
Kirill A. Shutemov