> On Jun 18, 2019, at 11:30 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > > On Tue, Jun 18, 2019 at 10:42 AM Nadav Amit <namit@xxxxxxxxxx> wrote: >>> On Jun 17, 2019, at 11:44 PM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote: >>> >>> On Wed, Jun 12, 2019 at 9:59 PM Nadav Amit <namit@xxxxxxxxxx> wrote: >>>> Running some microbenchmarks on dax keeps showing find_next_iomem_res() >>>> as a place in which significant amount of time is spent. It appears that >>>> in order to determine the cacheability that is required for the PTE, >>>> lookup_memtype() is called, and this one traverses the resources list in >>>> an inefficient manner. This patch-set tries to improve this situation. >>> >>> Let's just do this lookup once per device, cache that, and replay it >>> to modified vmf_insert_* routines that trust the caller to already >>> know the pgprot_values. >> >> IIUC, one device can have multiple regions with different characteristics, >> which require difference cachability. > > Not for pmem. It will always be one common cacheability setting for > the entirety of persistent memory. > >> Apparently, that is the reason there >> is a tree of resources. Please be more specific about where you want to >> cache it, please. > > The reason for lookup_memtype() was to try to prevent mixed > cacheability settings of pages across different processes . The > mapping type for pmem/dax is established by one of: > > drivers/nvdimm/pmem.c:413: addr = > devm_memremap_pages(dev, &pmem->pgmap); > drivers/nvdimm/pmem.c:425: addr = > devm_memremap_pages(dev, &pmem->pgmap); > drivers/nvdimm/pmem.c:432: addr = devm_memremap(dev, > pmem->phys_addr, > drivers/nvdimm/pmem.c-433- pmem->size, > ARCH_MEMREMAP_PMEM); > > ...and is constant for the life of the device and all subsequent mappings. > >> Perhaps you want to cache the cachability-mode in vma->vm_page_prot (which I >> see being done in quite a few cases), but I don’t know the code well enough >> to be certain that every vma should have a single protection and that it >> should not change afterwards. > > No, I'm thinking this would naturally fit as a property hanging off a > 'struct dax_device', and then create a version of vmf_insert_mixed() > and vmf_insert_pfn_pmd() that bypass track_pfn_insert() to insert that > saved value. Thanks for the detailed explanation. I’ll give it a try (the moment I find some free time). I still think that patch 2/3 is beneficial, but based on your feedback, patch 3/3 should be dropped.