On 2022/5/28 11:13, Matthew Wilcox wrote: > On Sat, May 28, 2022 at 10:52:11AM +0800, Miaohe Lin wrote: >> On 2022/5/27 23:02, Matthew Wilcox wrote: >>> What? No. This can absolutely happen. We have a refcount on the folio, >>> which means that any other thread can temporarily raise the refcount, >> >> IIUC, the folio is only in the isolated page_list now and it's not in the page cache, swap cache, pagetable or >> under any use. So there should be no way that any other thread can temporarily raise the refcount when >> folio_ref_count == 1. Or am I miss something? > > Take a look at something like GUP (fast). If this page _was_ mapped to > userspace, something like this can happen: > > Thread A Thread B > load PTE > unmap page > refcount goes to 1 > vmscan sees the page > try_get_ref > refcount is now 2. WARN_ON. > > Thread A will see that the PTE has changed and will now drop its > reference, but Thread B already spat out the WARN. > > A similar thing can happen with the page cache. Oh, I see. Many thanks for your patient explanation! :) > > If this is a worthwhile optimisation (does it happen often that we find > a refcount == 1 page?), we could do something like ... No, It should be rare. > > if (folio_ref_freeze(folio, 1)) { > nr_pages = folio_nr_pages(folio); > goto free_it; > } > > ... or ... > > if (folio_ref_count(folio) == 1 && > folio_ref_freeze(folio, 1)) { > > ... if we want to test-and-test-and-clear These proposed code changes look good to me. > > But this function is far too complicated already. I really want to > see numbers that proves the extra complexity is worth it. This optimization can save lots of cpu cycles and avoid possible disk I/O in that specified case. But that is a somewhat rare case. So there's no numbers that proves the extra complexity is worth it. Should I drop this patch or proceed with the proposed code changes above in next version? :) Many thanks! > > > . >