On 2022-03-31 04:53, Christoph Hellwig wrote:
- page = vm_normal_page(vma, addr, pte);
+ page = vm_normal_lru_page(vma, addr, pte);
Why can't this deal with ZONE_DEVICE pages? It certainly has
nothing do with a LRU I think. In fact being able to have
stats that count say the number of device pages here would
probably be useful at some point.
Maybe at some point. However, this is in a function called
"can_gather_numa_stats". There are no meaningful NUMA stats for device
pages. I agree that the name "vm_normal_lru_page" is not optimal in this
case.
In general I find the vm_normal_lru_page vs vm_normal_page
API highly confusing. An explicit check for zone device pages
in the dozen or so spots that care has a much better documentation
value, especially if accompanied by comments where it isn't entirely
obvious.
OK. We can do that. It would solve the function naming problem, and we'd
have more visibility of device page handling in more places in the
kernel, which has educational value.
Regards,
Felix
page = follow_page(vma, addr,
- FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE);
+ FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE | FOLL_LRU);
Overly long line here.
+/*
+ * NOTE: Technically this should goto check_pfn label. However, page->_mapcount
+ * is never incremented for device pages that are mmap through DAX mechanism
+ * using pmem driver mounted into ext4 filesystem. When these pages are unmap,
+ * zap_pte_range is called and vm_normal_page return a valid page with
+ * page_mapcount() = 0, before page_remove_rmap is called.
+ */
Please properly indent comments.
+ * zone, as long as the pte's are present and vm_normal_lru_page() succeeds. These
* pages also get pinned.
Another overly long line here.