On Sun, Mar 17, 2024 at 12:07:40PM +0800, Zhaoyang Huang wrote: > Could it be this scenario, where folio comes from pte(thread 0), local > fbatch(thread 1) and page cache(thread 2) concurrently and proceed > intermixed without lock's protection? Actually, IMO, thread 1 also > could see the folio with refcnt==1 since it doesn't care if the page > is on the page cache or not. > > madivise_cold_and_pageout does no explicit folio_get thing since the > folio comes from pte which implies it has one refcnt from pagecache Mmm, no. It's implicit, but madvise_cold_or_pageout_pte_range() does guarantee that the folio has at least one refcount. Since we get the folio from vm_normal_folio(vma, addr, ptent); we know that there is at least one mapcount on the folio. refcount is always >= mapcount. Since we hold pte_offset_map_lock(), we know that mapcount (and therefore refcount) cannot be decremented until we call pte_unmap_unlock(), which we don't do until we have called folio_isolate_lru(). Good try though, took me a few minutes of looking at it to convince myself that it was safe. Something to bear in mind is that if the race you outline is real, failing to hold a refcount on the folio leaves the caller susceptible to the VM_BUG_ON_FOLIO(!folio_ref_count(folio), folio); if the other thread calls folio_put(). I can't understand any of the scenarios you outline below. Please try again without relying on indentation. > #thread 0(madivise_cold_and_pageout) #1 > (lru_add_drain->fbatch_release_pages) > #2(read_pages->filemap_remove_folios) > refcnt == 1(represent page cache) > > refcnt==2(another one represent LRU) > folio comes from page cache > folio_isolate_lru > release_pages > filemap_free_folio > > > refcnt==1(decrease the one of page cache) > > folio_put_testzero == true > > <No lruvec_del_folio> > > list_add(folio->lru, pages_to_free) //current folio will break LRU's > integrity since it has not been deleted > > In case of gmail's wrap, split above chart to two parts > > #thread 0(madivise_cold_and_pageout) #1 > (lru_add_drain->fbatch_release_pages) > refcnt == 1(represent page cache) > > refcnt==2(another one represent LRU) > folio_isolate_lru release_pages > > folio_put_testzero == true > > <No lruvec_del_folio> > > list_add(folio->lru, pages_to_free) > > //current folio will break LRU's integrity since it has not been > deleted > > #1 (lru_add_drain->fbatch_release_pages) > #2(read_pages->filemap_remove_folios) > refcnt==2(another one represent LRU) > folio comes from page cache > release_pages > filemap_free_folio > > refcnt==1(decrease the one of page cache) > folio_put_testzero == true > <No lruvec_del_folio> > list_add(folio->lru, pages_to_free) > //current folio will break LRU's integrity since it has not been deleted > > > > > #0 folio_isolate_lru #1 release_pages > > > BUG_ON(!folio_refcnt) > > > if (folio_put_testzero()) > > > folio_get(folio) > > > if (folio_test_clear_lru())