Thanks Mark!! On 1/12/2022 5:08 PM, Mark Hemment wrote: >>>> +static int shmem_fadvise_dontneed(struct address_space *mapping, loff_t start, >>>> + loff_t end) >>>> +{ >>>> + int ret; >>>> + struct page *page; >>>> + LIST_HEAD(list); >>>> + struct writeback_control wbc = { >>>> + .sync_mode = WB_SYNC_NONE, >>>> + .nr_to_write = LONG_MAX, >>>> + .range_start = 0, >>>> + .range_end = LLONG_MAX, >>>> + .for_reclaim = 1, >>>> + }; >>>> + >>>> + if (!shmem_mapping(mapping)) >>>> + return -EINVAL; >>>> + >>>> + if (!total_swap_pages) >>>> + return 0; >>>> + >>>> + lru_add_drain(); >>>> + shmem_isolate_pages_range(mapping, start, end, &list); >>>> + >>>> + while (!list_empty(&list)) { >>>> + page = lru_to_page(&list); >>>> + list_del(&page->lru); >>>> + if (page_mapped(page)) >>>> + goto keep; >>>> + if (!trylock_page(page)) >>>> + goto keep; >>>> + if (unlikely(PageTransHuge(page))) { >>>> + if (split_huge_page_to_list(page, &list)) >>>> + goto keep; >>>> + } >>> I don't know the shmem code and the lifecycle of a shm-page, so >>> genuine questions; >>> When the try-lock succeeds, should there be a test for PageWriteback() >>> (page skipped if true)? Also, does page->mapping need to be tested >>> for NULL to prevent races with deletion from the page-cache? >> I failed to envisage it. I should have considered both these conditions >> here. BTW, I am just thinking about why we shouldn't use >> reclaim_pages(page_list) function here with an extra set_page_dirty() on >> a page that is isolated? It just call the shrink_page_list() where all >> these conditions are properly handled. What is your opinion here? > Should be possible to use reclaim_pages() (I haven't look closely). > It might actually be good to use this function, as will do some > congestion throttling. Although it will always try to unmap > pages (note: your page_mapped() test is 'unstable' as done without the > page locked), so might give behaviour you want to avoid. page_mapped can be true between isolate and then asking for reclaim of it through reclaim_pages(), and then can be unmapped there. Thanks for pointing it out. BTW, the posix_fadvise man pages[1] doesn't talk about any restrictions with the mapped pages. If so, Am I allowed to even unmap the pages when called FADV_DONTNEED on the file (agree for mapped, we can rely on MADV_DONTNEED too)? [1]https://man7.org/linux/man-pages/man2/posix_fadvise.2.html > Note: reclaim_pages() is already used for madvise(PAGEOUT). The shmem > code would need to prepare page(s) to help shrink_page_list() to make > progress (see madvise.c:madvise_cold_or_pageout_pte_range()). > > Taking a step back; is fadvise(DONTNEED) really needed/wanted? Yes, > you gave a usecase (which I cut from this thread in my earlier reply), > but I'm not familiar with various shmem uses to know if this feature > is needed. Someone else will need to answer this. Actually I needed this for the use case mentioned. And regarding the various use cases, I encountered that GEM buffers for display/graphics are using the shmem buffers. drivers/gpu/drm/i915/gem/i915_gem_shmem.c