On Tue, Jul 30, 2024 at 03:35:31PM +0930, Qu Wenruo wrote:
> Hi,
>
> With recent btrfs attempt to utilize larger folios (for its metadata), I
> am hitting a case like this:
>
> - Btrfs allocated an order 2 folio for metadata X
>
> - Btrfs tries to add the order 2 folio at filepos X
>   Then filemap_add_folio() returns -EEXIST for filepos X.
>
> - Btrfs tries to grab the existing metadata
>   Then filemap_lock_folio() returns -ENOENT for filepos X.
>
> The above case can have two causes:
>
> a) The folio at filepos X is released between add and lock
>    This is pretty rare, but still possible
>
> b) Some folios exist at range [X+4K, X+16K)
>    In my observation, this is way more common than case a).
>
> Case b) can be caused by the following situation:
>
> - There is an extent buffer at filepos X
>   And it is consisted of 4 order 0 folios.
>
> - vmscan wants to free folio at filepos X
>   It calls into the btrfs callback, btree_release_folio().
>   And btrfs did all the checks, release the metadata.
>
>   Now all the 4 folios at file pos [X, X+16K) have their private
>   flags cleared.
>
> - vmscan freed folio at filepos X
>   However the remaining 3 folios X+4K, X+8K, X+12K are still attached
>   to the filemap, and in theory we should free all 4 folios in one go.
>
>   And later cause the conflicts with the larger folio we want to insert.
>
> I'm wondering if there is anyway to make sure we can release all
> involved folios in one go?
> I guess it will need a new callback, and return a list of folios to be
> released?

I feel like we're missing a few pieces of this puzzle:

 - Why did btrfs decide to create four order-0 folios in the first place?
 - Why isn't there an EEXIST fallback from order-2 to order-1 to order-0
   folios?

But there's no need for a new API.  You can remove folios from the page
cache whenever you like.  See delete_from_page_cache_batch() as an
example.
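
To make case b) and the two options above concrete, here is a toy
userspace model.  It is only a sketch, not kernel code: the page cache is
reduced to an array with one slot per 4K index, and every name in it
(toy_folio, toy_add_folio(), toy_lookup(), toy_remove_range(),
toy_add_with_fallback()) is made up for illustration.  It assumes an
insert fails with -EEXIST if any slot in the folio's range is occupied,
and that a lookup only inspects the slot at the given index, which is
how the -EEXIST / -ENOENT pair described in the quoted report can arise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_SLOTS 16	/* toy cache: 16 indices of 4K each */

struct toy_folio {
	unsigned int order;	/* folio covers 1 << order slots */
};

/* One pointer per 4K index; a large folio fills every slot it covers. */
static struct toy_folio *slots[NR_SLOTS];

/* Insert at index; -EEXIST if anything already sits inside the range. */
int toy_add_folio(struct toy_folio *folio, size_t index)
{
	size_t nr = (size_t)1 << folio->order;

	assert(index % nr == 0 && index + nr <= NR_SLOTS);
	for (size_t i = 0; i < nr; i++)
		if (slots[index + i])
			return -EEXIST;
	for (size_t i = 0; i < nr; i++)
		slots[index + i] = folio;
	return 0;
}

/* Looks only at the slot for index; NULL stands in for -ENOENT. */
struct toy_folio *toy_lookup(size_t index)
{
	return slots[index];
}

/*
 * Drop everything in [index, index + nr) in one go.  This loosely stands
 * in for a batched removal; the range is assumed to be folio-aligned.
 */
void toy_remove_range(size_t index, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		slots[index + i] = NULL;
}

/*
 * Fallback from max_order down to order 0.  Returns the order that was
 * inserted, or -EEXIST if even order 0 conflicts.  Real code would
 * allocate a smaller folio; the toy just rewrites the order field.
 */
int toy_add_with_fallback(struct toy_folio *folio, size_t index,
			  unsigned int max_order)
{
	for (unsigned int order = max_order; ; order--) {
		folio->order = order;
		if (toy_add_folio(folio, index) == 0)
			return (int)order;
		if (order == 0)
			return -EEXIST;
	}
}
```

With stale order-0 entries left at X+1..X+3 and nothing at X, the
order-2 insert at X fails with -EEXIST while the lookup at X finds
nothing.  From there either toy_add_with_fallback() ends up inserting an
order-0 folio at X, or toy_remove_range() clears the leftovers so the
order-2 insert can succeed.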