>On Sat, Mar 16, 2024 at 10:59 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>>
>> On Sat, Mar 16, 2024 at 04:53:09PM +0800, Zhaoyang Huang wrote:
>> > On Fri, Mar 15, 2024 at 8:46 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>> > >
>> > > On Thu, Mar 14, 2024 at 04:39:21PM +0800, zhaoyang.huang wrote:
>> > > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
>> > > >
>> > > > Panic[1] reported which is caused by lruvec->list break. Fix the
>> > > > race between folio_isolate_lru and release_pages.
>> > > >
>> > > > race condition:
>> > > > release_pages could meet a non-refered folio which escaped from
>> > > > being deleted from LRU but add to another list_head
>> > >
>> > > I don't think the bug is in folio_isolate_lru() but rather in its
>> > > caller.
>> > >
>> > >  * Context:
>> > >  *
>> > >  * (1) Must be called with an elevated refcount on the folio. This is a
>> > >  *     fundamental difference from isolate_lru_folios() (which is called
>> > >  *     without a stable reference).
>> > >
>> > > So when release_pages() runs, it must not see a refcount
>> > > decremented to zero, because the caller of folio_isolate_lru() is
>> > > supposed to hold one.
>> > >
>> > > Your stack trace is for the thread which is calling
>> > > release_pages(), not the one calling folio_isolate_lru(), so I can't
>> > > help you debug further.
>> > Thanks for the comments. According to my understanding,
>> > folio_put_testzero does the decrement before test which makes it
>> > possible to have release_pages see refcnt equal zero and proceed
>> > further(folio_get in folio_isolate_lru has not run yet).
>>
>> No, that's not possible.
>>
>> In the scenario below, at entry to folio_isolate_lru(), the folio has
>> refcount 2. It has one refcount from thread 0 (because it must own
>> one before calling folio_isolate_lru()) and it has one refcount from
>> thread 1 (because it's about to call release_pages()).
>> If release_pages() were not running, the folio would have refcount 3 when
>> folio_isolate_lru() returned.
>Could it be this scenario, where the folio is reached from the pte (thread 0),
>a local fbatch (thread 1) and the page cache (thread 2) concurrently, and the
>three threads proceed intermixed without a lock's protection? Actually, IMO,
>thread 1 could also see the folio with refcnt==1, since it doesn't care
>whether the folio is in the page cache or not.
>
>madivise_cold_and_pageout does no explicit folio_get, since the folio comes
>from the pte, which implies it has one refcnt from the page cache.
>
>#thread 0(madivise_cold_and_pageout)    #1(lru_add_drain->fbatch_release_pages)    #2(read_pages->filemap_remove_folios)
>refcnt == 1(represent page cache)       refcnt==2(another one represent LRU)       folio comes from page cache
>folio_isolate_lru                       release_pages                              filemap_free_folio
>                                                                                   refcnt==1(decrease the one of page cache)
>                                        folio_put_testzero == true
>                                        <No lruvec_del_folio>
>                                        list_add(folio->lru, pages_to_free)
>                                        //current folio will break LRU's integrity since it has not been deleted
>
>In case of gmail's wrapping, the chart above is split into two parts:
>
>#thread 0(madivise_cold_and_pageout)    #1(lru_add_drain->fbatch_release_pages)
>refcnt == 1(represent page cache)       refcnt==2(another one represent LRU)
>folio_isolate_lru                       release_pages
>                                        folio_put_testzero == true
>                                        <No lruvec_del_folio>
>                                        list_add(folio->lru, pages_to_free)
>                                        //current folio will break LRU's integrity since it has not been deleted
>
>#1(lru_add_drain->fbatch_release_pages)    #2(read_pages->filemap_remove_folios)
>refcnt==2(another one represent LRU)       folio comes from page cache
>release_pages                              filemap_free_folio
>                                           refcnt==1(decrease the one of page cache)
>folio_put_testzero == true
><No lruvec_del_folio>
>list_add(folio->lru, pages_to_free)
>//current folio will break LRU's integrity since it has not been deleted
>>
>> > #0 folio_isolate_lru          #1 release_pages
>> > BUG_ON(!folio_refcnt)
>> >                               if (folio_put_testzero())
>> > folio_get(folio)
>> > if (folio_test_clear_lru())

Resending the chart via Outlook:

#thread 0(madivise_cold_and_pageout)    #1(lru_add_drain->fbatch_release_pages)    #2(read_pages->filemap_remove_folios)
refcnt == 1(represent page cache)       refcnt==2(another one represent LRU)       folio comes from page cache
folio_isolate_lru                       release_pages                              filemap_free_folio
                                                                                   refcnt==1(decrease the one of page cache)
                                        folio_put_testzero == true
                                        <No lruvec_del_folio>
                                        list_add(folio->lru, pages_to_free)
                                        //current folio will break LRU's integrity since it has not been deleted
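
To make the refcount arithmetic in the two views easier to compare, here is a minimal userspace sketch that replays both interleavings sequentially. It is not kernel code: `model_folio`, `model_put_testzero`, `scenario_caller_holds_ref` and `scenario_borrowed_ref` are hypothetical names made up for illustration, the interleaving is fixed rather than truly concurrent, and the only kernel behaviour assumed is that folio_put_testzero() decrements and then tests for zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: only what the chart uses. */
struct model_folio {
	atomic_int refcount;
	bool lru_flag;	/* models the LRU flag tested by folio_test_clear_lru() */
};

/* Models folio_put_testzero(): drop one reference, report whether it hit zero. */
static inline bool model_put_testzero(struct model_folio *f)
{
	/* atomic_fetch_sub() returns the old value, so old == 1 means new == 0. */
	return atomic_fetch_sub(&f->refcount, 1) == 1;
}

/*
 * Scenario from Matthew's reply: thread 0 owns a reference before calling
 * folio_isolate_lru(), thread 1 owns the one it is about to drop.
 * Returns what thread 1's put_testzero observes.
 */
bool scenario_caller_holds_ref(void)
{
	struct model_folio f = { .lru_flag = true };
	bool saw_zero;

	atomic_init(&f.refcount, 2);		/* thread 0's ref + thread 1's ref */
	assert(atomic_load(&f.refcount) > 0);	/* thread 0: BUG_ON(!folio_refcnt) */
	saw_zero = model_put_testzero(&f);	/* thread 1: 2 -> 1 */
	atomic_fetch_add(&f.refcount, 1);	/* thread 0: folio_get(), 1 -> 2 */
	return saw_zero;
}

/*
 * Scenario from the chart: thread 0 takes no reference of its own and relies
 * on the page cache's, which thread 2 drops concurrently.
 * Returns what thread 1's put_testzero observes.
 */
bool scenario_borrowed_ref(void)
{
	struct model_folio f = { .lru_flag = true };

	atomic_init(&f.refcount, 2);		/* page cache ref + the one the chart labels "LRU" */
	assert(atomic_load(&f.refcount) > 0);	/* thread 0: BUG_ON(!folio_refcnt) passes */
	f.lru_flag = false;			/* thread 0: LRU flag cleared, folio_get() not yet run */
	model_put_testzero(&f);			/* thread 2: filemap_free_folio, 2 -> 1 */
	return model_put_testzero(&f);		/* thread 1: release_pages, 1 -> 0 */
}
```

In this model scenario_caller_holds_ref() returns false and scenario_borrowed_ref() returns true. So the two views differ only in whether thread 0 really owns a reference at entry to folio_isolate_lru(), which is the question I am trying to confirm.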