On Fri, 9 Aug 2024, David Hildenbrand wrote:
> This really seems to be the latest point where we can "easily" back off and
> unlock the source folio -- in this function :)
>
> I wonder if we should be smarter in the migrate_pages_batch() loop when we
> start the actual migrations via migrate_folio_move(): if we detect that a
> folio has unexpected references *and* it has waiters (PG_waiters), back off
> then and retry the folio later. If it only has unexpected references, just
> keep retrying: no waiters -> nobody is waiting for the lock to make progress.

Well, just back off ASAP if waiters are detected at any time. A waiter
would have increased the refcount, and a waiter will likely modify the
page status soon. So push the folio to the end of the list of pages to be
migrated, to give it as much time as we can, and check again later.
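
To make the ordering concrete, here is a rough userspace sketch of that
policy. It is not kernel code: struct model_folio, classify_folio() and
push_waited_to_tail() are hypothetical names made up for illustration,
and has_waiters merely stands in for a PG_waiters check. It assumes a
folio with waiters is always deferred, and a folio with only unexpected
references is retried in place.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a folio; NOT the kernel's struct folio. */
struct model_folio {
    int id;
    int refcount;        /* references currently held */
    int expected_refs;   /* references migration expects to see */
    bool has_waiters;    /* stands in for PG_waiters being set */
};

enum folio_action {
    FOLIO_MIGRATE,       /* no waiters, refcount as expected */
    FOLIO_RETRY,         /* unexpected refs but nobody waits on the lock */
    FOLIO_DEFER          /* waiters present: back off, look again later */
};

/* Decide what to do with one folio under the assumed policy. */
enum folio_action classify_folio(const struct model_folio *f)
{
    if (f->has_waiters)
        return FOLIO_DEFER;
    if (f->refcount != f->expected_refs)
        return FOLIO_RETRY;
    return FOLIO_MIGRATE;
}

/*
 * Move every folio with waiters to the tail of the batch, keeping the
 * relative order inside both groups. Returns the index of the first
 * deferred folio (equal to n if none were deferred).
 */
size_t push_waited_to_tail(struct model_folio *batch, size_t n)
{
    size_t first_deferred = 0;

    for (size_t i = 0; i < n; i++) {
        if (batch[i].has_waiters)
            continue;

        /* Shift the deferred run up by one slot and drop this folio
         * in front of it. */
        struct model_folio tmp = batch[i];
        for (size_t j = i; j > first_deferred; j--)
            batch[j] = batch[j - 1];
        batch[first_deferred++] = tmp;
    }
    return first_deferred;
}
```

Calling push_waited_to_tail() on a batch before the move phase leaves the
folios with waiters at the end, so they get the longest possible time
before they are checked again.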