On Mon, Dec 09, 2024 at 10:50:42AM -0500, Josef Bacik wrote:
> As we've noticed in the upstream bug report for your initial work here, this
> isn't quite correct, as we could have gotten a large folio in from userspace. I
> think the better thing here is to do the page extraction, and then keep track of
> the last folio we saw, and simply skip any folios that are the same for the
> pages we have. This way we can handle large folios correctly. Thanks,

Some people have in the past thought that they could skip the subsequent
page lookups if the folio they get back is large.  This is an incorrect
optimisation.  Userspace may mmap() a file PROT_WRITE, MAP_PRIVATE.
If they store to the middle of a large folio (the file that is mmapped
may be on a filesystem that does support large folios, rather than
fuse), then we'll have, eg:

folio A page 0
folio A page 1
folio B page 0
folio A page 3

where folio A belongs to the file and folio B is an anonymous COW page.
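
To make the pitfall concrete, here is a minimal sketch of the approach
Josef describes: walk the extracted pages and compare each page's folio
against the last folio seen, rather than assuming that a large folio
covers the pages which follow it.  This is my illustration, not fuse's
actual code; count_folio_runs() and the pages[] array are hypothetical
names, standing in for an array filled by something like
iov_iter_extract_pages().

/*
 * Sketch only.  Because of the COW case above, a page that sits in the
 * middle of a large folio's byte range may belong to a different folio
 * entirely, so we must look up the folio for every page and can only
 * coalesce consecutive pages whose folio matches the previous one.
 */
#include <linux/mm.h>
#include <linux/printk.h>

static unsigned int count_folio_runs(struct page **pages,
				     unsigned int npages)
{
	struct folio *prev = NULL;
	unsigned int i, runs = 0;

	for (i = 0; i < npages; i++) {
		struct folio *folio = page_folio(pages[i]);

		/* A new run starts whenever the folio changes. */
		if (folio != prev) {
			runs++;
			prev = folio;
		}
	}

	return runs;
}

With the layout above this counts three runs (folio A, folio B, folio A
again), which is why stopping the lookups at the first large folio, or
comparing every page against only the first folio, goes wrong.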