Re: [PATCH] mm/migrate: fix deadlock in migrate_pages_batch() on large folios

Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> · Mon, 29 Jul 2024 06:35:22 +0800

Hi,

On 2024/7/29 05:17, Matthew Wilcox wrote:
> On Sun, Jul 28, 2024 at 12:50:05PM -0700, Andrew Morton wrote:
>> On Sun, 28 Jul 2024 23:49:13 +0800 Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>>> Currently, migrate_pages_batch() can lock multiple locked folios
>>> with an arbitrary order.  Although folio_trylock() is used to avoid
>>> deadlock as commit 2ef7dbb26990 ("migrate_pages: try migrate in batch
>>> asynchronously firstly") mentioned, it seems try_split_folio() is
>>> still missing.

>> Am I correct in believing that folio_lock() doesn't have lockdep coverage?

> Yes.  It can't; it is taken in process context and released by whatever
> context the read completion happens in (could be hard/soft irq, could be
> a workqueue, could be J. Random kthread, depending on the device driver)
> So it doesn't match the lockdep model at all.

>>> It was found by compaction stress test when I explicitly enable EROFS
>>> compressed files to use large folios, which case I cannot reproduce with
>>> the same workload if large folio support is off (current mainline).
>>> Typically, filesystem reads (with locked file-backed folios) could use
>>> another bdev/meta inode to load some other I/Os (e.g. inode extent
>>> metadata or caching compressed data), so the locking order will be:

>> Which kernels need fixing.  Do we expect that any code paths in 6.10 or
>> earlier are vulnerable to this?

> I would suggest it goes back to the introduction of large folios, but
> that's just a gut feeling based on absolutely no reading of code or
> inspection of git history.

According to 5dfab109d519 ("migrate_pages: batch _unmap and _move"),
I think it's v6.3+.

I don't have time to look into the full history of batched migration,
though; hopefully Huang, Ying can give more hints on this.
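
For anyone following along, here is a rough userspace sketch of the
trylock idea that the commit message refers to: when several locks are
taken in arbitrary order, never sleep on one of them while holding the
others. This is an analogy only, not kernel code. pthread mutexes stand
in for folio locks, and trylock_batch() / unlock_batch() are hypothetical
helper names made up for illustration. It assumes default
(non-recursive) mutexes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace analogy: pthread mutexes stand in for folio locks.
 * Both helpers are hypothetical and exist only for this sketch.
 *
 * Walk the batch in whatever order it arrives and take each lock with
 * trylock only. A contended lock is skipped instead of slept on, so the
 * caller never blocks while already holding other locks of the batch.
 * got[i] records whether locks[i] was acquired; the return value is the
 * number of locks acquired.
 */
static size_t trylock_batch(pthread_mutex_t **locks, bool *got, size_t n)
{
	size_t taken = 0;

	for (size_t i = 0; i < n; i++) {
		got[i] = (pthread_mutex_trylock(locks[i]) == 0);
		if (got[i])
			taken++;
	}
	return taken;
}

/* Release only the locks that trylock_batch() actually acquired. */
static void unlock_batch(pthread_mutex_t **locks, const bool *got, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (got[i]) {
			int rc = pthread_mutex_unlock(locks[i]);

			assert(rc == 0);
			(void)rc; /* keep NDEBUG builds warning-free */
		}
	}
}
```

Skipped entries would be retried later, one at a time, with nothing else
held. The point of the patch is that a blocking lock anywhere inside such
a batch (here, on the folio split path) defeats this scheme.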

Thanks,
Gao Xiang