On Fri, Sep 7, 2018 at 2:25 AM Zhang Yi <yi.z.zhang@xxxxxxxxxxxxxxx> wrote:
>
> For device specific memory space, when we move these area of pfn to
> memory zone, we will set the page reserved flag at that time, some of
> these reserved for device mmio, and some of these are not, such as
> NVDIMM pmem.
>
> Now, we map these dev_dax or fs_dax pages to kvm for DIMM/NVDIMM
> backend, since these pages are reserved, the check of
> kvm_is_reserved_pfn() misconceives those pages as MMIO. Therefor, we
> introduce 2 page map types, MEMORY_DEVICE_FS_DAX/MEMORY_DEVICE_DEV_DAX,
> to identify these pages are from NVDIMM pmem and let kvm treat these
> as normal pages.
>
> Without this patch, many operations will be missed due to this
> mistreatment to pmem pages, for example, a page may not have chance to
> be unpinned for KVM guest(in kvm_release_pfn_clean), not able to be
> marked as dirty/accessed(in kvm_set_pfn_dirty/accessed) etc.
>
> Signed-off-by: Zhang Yi <yi.z.zhang@xxxxxxxxxxxxxxx>
> Acked-by: Pankaj Gupta <pagupta@xxxxxxxxxx>
> ---
>  virt/kvm/kvm_main.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index c44c406..9c49634 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -147,8 +147,20 @@ __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
>
>  bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
>  {
> -       if (pfn_valid(pfn))
> -               return PageReserved(pfn_to_page(pfn));
> +       struct page *page;
> +
> +       if (pfn_valid(pfn)) {
> +               page = pfn_to_page(pfn);
> +
> +               /*
> +                * For device specific memory space, there is a case
> +                * which we need pass MEMORY_DEVICE_FS[DEV]_DAX pages
> +                * to kvm, these pages marked reserved flag as it is a
> +                * zone device memory, we need to identify these pages
> +                * and let kvm treat these as normal pages
> +                */
> +               return PageReserved(page) && !is_dax_page(page);

Should we consider just not setting PageReserved for devm_memremap_pages()?
Perhaps kvm is not the only component making these assumptions about this flag?

Why is MEMORY_DEVICE_PUBLIC memory specifically excluded? This has less to do with "dax" pages and more to do with ranges established by devm_memremap_pages(). P2PDMA is another producer of these pages. If either MEMORY_DEVICE_PUBLIC or P2PDMA pages can be used in these kvm paths, then I think that points toward clearing the Reserved flag instead.

That said, I haven't audited all the locations that test PageReserved().

Sorry for not responding sooner; I was on extended leave.
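
To make the difference between the quoted check and the alternatives raised above concrete, here is a rough userspace model. It is only a sketch, not kernel code: struct model_page, the MODEL_* values and every model_* function are hypothetical stand-ins invented for illustration, and the "return true" fallback for an invalid pfn is an assumption, since the quoted hunk does not show that part of the function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
enum model_pgmap_type {
	MODEL_NONE = 0,		/* not a devm_memremap_pages() page */
	MODEL_PUBLIC,		/* stands in for MEMORY_DEVICE_PUBLIC */
	MODEL_FS_DAX,		/* stands in for MEMORY_DEVICE_FS_DAX */
	MODEL_DEV_DAX,		/* stands in for MEMORY_DEVICE_DEV_DAX */
	MODEL_P2PDMA,		/* stands in for P2PDMA-produced pages */
};

struct model_page {
	bool valid;		/* models pfn_valid() */
	bool reserved;		/* models PageReserved() */
	enum model_pgmap_type type;
};

/* Unmodified check: any valid page with the flag set counts as reserved. */
bool model_reserved_plain(const struct model_page *p)
{
	if (p->valid)
		return p->reserved;
	return true;	/* assumed fallback; not shown in the quoted hunk */
}

/* Check as in the quoted patch: only the two DAX types are exempted. */
bool model_reserved_dax_only(const struct model_page *p)
{
	bool dax = p->type == MODEL_FS_DAX || p->type == MODEL_DEV_DAX;

	if (p->valid)
		return p->reserved && !dax;
	return true;
}

/* Alternative: exempt every page that came from devm_memremap_pages(). */
bool model_reserved_any_devmap(const struct model_page *p)
{
	if (p->valid)
		return p->reserved && p->type == MODEL_NONE;
	return true;
}

/* Walks the cases discussed in this thread. */
void model_self_check(void)
{
	struct model_page mmio = { true, true, MODEL_NONE };
	struct model_page pmem = { true, true, MODEL_FS_DAX };
	struct model_page pub  = { true, true, MODEL_PUBLIC };
	struct model_page p2p  = { true, true, MODEL_P2PDMA };

	/* Real MMIO stays reserved under both modified checks. */
	assert(model_reserved_dax_only(&mmio));
	assert(model_reserved_any_devmap(&mmio));

	/* pmem is treated as a normal page by both modified checks. */
	assert(!model_reserved_dax_only(&pmem));
	assert(!model_reserved_any_devmap(&pmem));

	/* PUBLIC and P2PDMA are where the two variants disagree. */
	assert(model_reserved_dax_only(&pub));
	assert(!model_reserved_any_devmap(&pub));
	assert(model_reserved_dax_only(&p2p));
	assert(!model_reserved_any_devmap(&p2p));

	/* If the Reserved flag is never set, the plain check suffices. */
	pub.reserved = false;
	assert(!model_reserved_plain(&pub));
}
```

In this model the DAX-only check still reports MEMORY_DEVICE_PUBLIC and P2PDMA style pages as reserved, while not setting the flag at all lets the unmodified check handle every devm_memremap_pages() page the same way. Whether that is safe for the other PageReserved() users is exactly the part that has not been audited.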