On Tue, Feb 7, 2023 at 11:29 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > On Tue, Feb 07, 2023 at 10:37:06AM +0900, David Stevens wrote: > > On Tue, Feb 7, 2023 at 6:50 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: > > > On Mon, Feb 06, 2023 at 03:52:19PM -0500, Peter Xu wrote: > > > > The problem is khugepaged will release pgtable lock during collapsing, so > > > > AFAICT there can be a race where some other thread tries to insert pages > > > > into page cache in parallel with khugepaged right after khugepaged released > > > > the page cache lock. > > > > > > > > For example, it seems to me new page cache can be inserted when khugepaged > > > > is copying small page content to the new hpage. > > > > This particular race can't happen with either patch, since the missing > > page cache entries are filled when we create the multi-index entry for > > hpage. > > Can too. > > for (index = start; index < end; index++) { > ... > if (xa_is_value(page) || !PageUptodate(page)) { > xas_unlock_irq(&xas); > /* swap in or instantiate fallocated page */ > if (shmem_get_folio(mapping->host, index, > &folio, SGP_NOALLOC)) { > result = SCAN_FAIL; > goto xa_unlocked; > } > ... > > So we start the iteration, and then a page fault happens in one of > the indices we've already examined, but we don't have the page on > the list. It's a nice wide race too because we're bringing the > page in from swap. > Ah, I misunderstood your objection to refer specifically to the copy_highpage loop at the end of collapse_file, rather than the entire process of the collapse. Yes, there can definitely be faults that fill in pages during the overall process of collapsing. However, my patch re-checks nr_none after acquiring the page cache lock for the final time, right before creating the hpage multi-index entry. Since present pages are locked, they can't be truncated after we've looked at them. That means that if nr_none still matches, then there weren't any page faults that populated missing pages. If we do detect any filled in pages, we can just roll back the collapse to avoid any problems. -David