On Wed, Dec 11, 2024 at 1:11 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Wed, Dec 11, 2024 at 01:04:45PM -0800, Joanne Koong wrote:
> > On Mon, Dec 9, 2024 at 7:54 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Dec 09, 2024 at 10:50:42AM -0500, Josef Bacik wrote:
> > > > As we've noticed in the upstream bug report for your initial work
> > > > here, this isn't quite correct, as we could have gotten a large
> > > > folio in from userspace. I think the better thing here is to do
> > > > the page extraction, then keep track of the last folio we saw and
> > > > simply skip any pages whose folio is the same as the previous
> > > > page's. This way we can handle large folios correctly. Thanks,
> > >
> > > Some people have in the past thought that they could skip the
> > > subsequent page lookups if the folio they get back is large. This
> > > is an incorrect optimisation. Userspace may mmap() a file
> > > PROT_WRITE, MAP_PRIVATE. If they store to the middle of a large
> > > folio (the mmapped file may be on a filesystem that does support
> > > large folios, unlike fuse), then we'll have, e.g.:
> > >
> > > folio A page 0
> > > folio A page 1
> > > folio B page 0
> > > folio A page 3
> > >
> > > where folio A belongs to the file and folio B is an anonymous COW
> > > page.
> >
> > Sounds good, I'll fix this up in v3. Thanks.
>
> Hm? I didn't notice this bug in your code, just mentioning something
> I've seen other people do and wanted to make sure you didn't. Did I
> miss a bug in your code?

Hi Matthew,

I believe I'm doing exactly this in this patchset, with these two lines:

len = min_t(ssize_t, ret, folio_size(folio) - folio_offset);
...
i += DIV_ROUND_UP(start + len, PAGE_SIZE);

(where i is the index into the array of extracted pages)

Here I incorrectly assume that the entire folio is contiguously
represented in the next set of extracted pages, so I just skip over
them. What I need to do instead is check every extracted page to see
whether it actually belongs to the same folio as the previous page, and
adjust the length calculations accordingly.

Thanks for flagging this.
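
For v3, I'm picturing the per-page walk looking roughly like the sketch
below ("pages", "nr_pages", "remaining", and the two range helpers are
placeholder names for illustration, not code from the patchset):

	struct folio *prev_folio = NULL;
	size_t offset = start;	/* offset into the first page */
	unsigned int i;

	for (i = 0; i < nr_pages && remaining; i++) {
		struct folio *folio = page_folio(pages[i]);
		size_t len = min_t(size_t, remaining, PAGE_SIZE - offset);

		if (folio == prev_folio) {
			/*
			 * The next extracted page still belongs to the
			 * same folio, so just grow the current range.
			 */
			extend_cur_range(len);	/* placeholder helper */
		} else {
			/*
			 * A different folio, e.g. an anonymous COW page
			 * in the middle of a large file folio, so start
			 * a new range rather than skipping ahead by
			 * DIV_ROUND_UP(start + len, PAGE_SIZE) pages.
			 */
			start_new_range(folio, offset, len);	/* placeholder */
			prev_folio = folio;
		}
		remaining -= len;
		offset = 0;
	}

i.e. the folio comparison decides whether to extend the current range
or start a new one, and i only ever advances one page at a time, so the
folio A / folio B / folio A interleaving you describe above falls out
naturally.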