From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>

A VM_BUG_ON in step 10 of [1] can happen because the refcount is
dropped improperly during read_pages()->readahead_folio()->folio_put().
This was introduced by commit 9fd472af84ab ("mm: improve cleanup when
->readpages doesn't process all pages").

Key steps of [1] in brief:
2'.  Thread_truncate gets the folio into its local fbatch via
     find_get_entries() in step 2.
7'.  After step 7, the last remaining refcount is not the one taken by
     alloc_pages, as would be expected, but the one held by
     Thread_truncate's local fbatch.
8'.  Thread_reclaim succeeds in isolating the folio in step 8 because
     the refcount has the expected value, even though it has the wrong
     meaning.
10'. Thread_truncate hits the VM_BUG_ON in step 10.

[1]
Thread_readahead:
0. folio = filemap_alloc_folio(gfp_mask, 0);
   (refcount 1: alloc_pages)
1. ret = filemap_add_folio(mapping, folio, index + i, gfp_mask);
   (refcount 2: alloc_pages, page_cache)

Thread_truncate:
2. folio = find_get_entries(&fbatch_truncate);
   (refcount 3: alloc_pages, page_cache, fbatch_truncate)

Thread_readahead:
3. Then we call read_pages(). First we call ->readahead(), which for
   some reason stops early.
4. Then we call readahead_folio(), which calls folio_put().
   (refcount 2: page_cache, fbatch_truncate)
5. Then we call folio_get().
   (refcount 3: page_cache, fbatch_truncate, read_pages_temp)
6. Then we call filemap_remove_folio().
   (refcount 2: fbatch_truncate, read_pages_temp)
7. Then we call folio_unlock() and folio_put().
   (refcount 1: fbatch_truncate)

Thread_reclaim:
8. Collect the folio from the LRU and call
   shrink_inactive_list()->isolate_lru_folios().
   shrink_inactive_list
   {
       isolate_lru_folios
       {
           if (!folio_test_lru(folio))    // false
               bail out;
           if (!folio_try_get(folio))     // false
               bail out;
       }
   }
   (refcount 2: fbatch_truncate, reclaim_isolate)

9. Call shrink_folio_list()->__remove_mapping().
   shrink_folio_list()
   {
       folio_trylock(folio);
       __remove_mapping()
       {
           if (!folio_ref_freeze(folio, 2))    // false
               bail out;
       }
       folio_unlock(folio);
       list_add(folio, free_folios);
   }
   (folio now has refcount 0)

Thread_truncate:
10. Thread_truncate hits the VM_BUG_ON(refcount == 0) in
    release_pages()->folio_put_testzero().
    truncate_inode_pages_range
    {
        folio = find_get_entries(&fbatch_truncate);
        truncate_inode_pages(&fbatch_truncate);
        folio_batch_release(&fbatch_truncate);
        {
            folio_put_testzero(folio);    // VM_BUG_ON here
        }
    }

Fixes: 9fd472af84ab ("mm: improve cleanup when ->readpages doesn't process all pages")
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
---
patch v2: update commit message
---
 mm/readahead.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index 130c0e7df99f..657736200a92 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -163,7 +163,7 @@ static void read_pages(struct readahead_control *rac)
 	 * may be used to size the next readahead, so make sure
 	 * they accurately reflect what happened.
 	 */
-	while ((folio = readahead_folio(rac)) != NULL) {
+	while ((folio = __readahead_folio(rac)) != NULL) {
 		unsigned long nr = folio_nr_pages(folio);
 
 		folio_get(folio);
-- 
2.25.1
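
For context on the one-line change above: readahead_folio() and
__readahead_folio() differ only in that the former drops the reference
the readahead code holds on behalf of the caller before returning the
folio. A rough sketch of the two helpers, paraphrased and simplified
from include/linux/pagemap.h (not the exact mainline code):

static inline struct folio *__readahead_folio(struct readahead_control *ractl)
{
	struct folio *folio;

	/* Advance the iterator past the batch the caller just consumed. */
	ractl->_nr_pages -= ractl->_batch_count;
	ractl->_index += ractl->_batch_count;
	if (!ractl->_nr_pages) {
		ractl->_batch_count = 0;
		return NULL;
	}

	/* Look up the next folio; the readahead reference is kept. */
	folio = xa_load(&ractl->mapping->i_pages, ractl->_index);
	ractl->_batch_count = folio_nr_pages(folio);
	return folio;
}

static inline struct folio *readahead_folio(struct readahead_control *ractl)
{
	struct folio *folio = __readahead_folio(ractl);

	/* Drop the reference readahead held for the caller. */
	if (folio)
		folio_put(folio);
	return folio;
}

The effect of the change is that the cleanup loop in read_pages() no
longer drops the readahead reference (step 4 in [1]) before taking its
own reference with folio_get() (step 5).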