On Tue, 25 Jun 2024, Kefeng Wang wrote:

> On 2024/6/25 13:04, Hugh Dickins wrote:
> > Commit "mm: migrate: split folio_migrate_mapping()" drew attention to
> > "Note, the folio_ref_freeze() is moved out of xas_lock_irq(), Since the
> > folio is already isolated and locked during migration, so suppose that
> > there is no functional change."
> >
> > That was a mistake.  Freezing a folio's refcount to 0 is much like taking
> > a spinlock: see how filemap_get_entry() takes rcu_read_lock() then spins
> > around until the folio is unfrozen.  If the task freezing is preempted (or
> > calls cond_resched(), as folio_mc_copy() may do), then it risks deadlock:
> > in my case, one CPU in zap_pte_range() with free_swap_and_cache_nr()
> > trying to reclaim swap while PTL is held, all the other CPUs in reclaim
> > spinning for that PTL.
>
> Oh, thanks for pointing this out, I didn't take that into account.
>
> > I'm uncertain whether it's necessary for interrupts to be disabled as
> > well as preemption, but since they have to be disabled for the page
> > cache migration, it's much the best to do it all together as before.
> > So revert to folio_ref_freeze() under xas_lock_irq(): but keep the
> > preliminary folio_ref_count() check, which does make sense before
> > trying to copy the folio's data.
>
> This is what my RFC version[1] does, which adds same reference check to
> avoid the unnecessary folio_mc_copy().
>
> Should I resend all patches, or Andrew directly pick this one?

Andrew asks for a resend: I was only aiming to fix the bug, and have no
perspective on how much of the series remains worthwhile.

> Thanks.
>
> [1]
> https://lore.kernel.org/linux-mm/20240129070934.3717659-7-wangkefeng.wang@xxxxxxxxxx/
>
> > Use "expected_count" for the expected count throughout.
> >
> > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> > ---
> >  mm/migrate.c | 59 +++++++++++++++++++++++++---------------------------
> >  1 file changed, 28 insertions(+), 31 deletions(-)
> >
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 27f070f64f27..8beedbb42a93 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -400,8 +400,8 @@ static int folio_expected_refs(struct address_space *mapping,
> >   * 2 for folios with a mapping
> >   * 3 for folios with a mapping and PagePrivate/PagePrivate2 set.
> >   */
> > -static void __folio_migrate_mapping(struct address_space *mapping,
> > -		struct folio *newfolio, struct folio *folio, int expected_cnt)
> > +static int __folio_migrate_mapping(struct address_space *mapping,
> > +		struct folio *newfolio, struct folio *folio, int expected_count)
>
> We can rename it back to folio_migrate_mapping().

Almost.  I did want to remove the __ layer, but internally the last
parameter is expected_count, whereas externally it is extra_count: each
has merit in its place, so I left them as is.

Hugh