Re: [PATCH 01/42] xfs: fix low space alloc deadlock

luminosity1999@xxxxxxxxx · Mon, 23 Dec 2024 17:43:12 +0800

Suppose two threads are doing delayed allocation and both of them
trigger a btree split. Because of the delayed allocation, thread A
holds the AGF 1 lock and thread B holds the AGF 3 lock:

agno(low -> high): 1  3
                   A  B

Then both of them queue a btree split work (AS from A and BS from B),
but due to the memory pressure described by commit
c85007e2e3942da1f9361e4b5a9388ea3a8dcc5b ("xfs: don't use BMBT btree
split workers for IO completion"), both works are processed by the
rescuer thread, AS first, then BS. AS first tries every AG with the
TRYLOCK flag and fails on all of them, so it retries with blocking
allocation until it reaches AG 3, and deadlocks there: the AGF 3 lock
is held by thread B, but B's split work BS is queued behind AS and has
no chance to be processed. The locking order is followed, yet the
deadlock still occurs.
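
To make the ordering problem concrete, here is a toy userspace model of
the scenario. It is only a sketch and not kernel code: run_rescuer() and
its parameters are made up for illustration, and it assumes that AS can
only find space in AG 3, that each blocking pass starts at the AG the
initiator already holds, and that an initiator drops its AGF locks only
after its own split work has run.

```python
from collections import deque


def run_rescuer(queue, agf_owner, ags_with_space, ag_count):
    """Toy model of one rescuer thread draining split works in FIFO order.

    queue:          list of (work, initiator, start_agno), in processing order
    agf_owner:      dict agno -> thread that holds that AGF lock
    ags_with_space: set of agnos where the blocking allocation can succeed
    ag_count:       number of AGs (hypothetical, agnos are 0..ag_count-1)
    """
    agf_owner = dict(agf_owner)          # do not mutate the caller's dict
    pending = deque(queue)
    done = []
    while pending:
        work, initiator, start_agno = pending.popleft()
        still_queued = {w[1] for w in pending}   # initiators waiting behind us
        # Blocking pass: walk AGs low -> high, honouring the locking order.
        for agno in range(start_agno, ag_count):
            owner = agf_owner.get(agno)
            if owner not in (None, initiator) and owner in still_queued:
                # The owner only unlocks after its own work runs, but that
                # work sits behind us on the same thread: nobody can progress.
                return ("deadlock", work, agno, owner)
            if agno in ags_with_space:
                break                     # allocation succeeded in this AG
        # Work finished: its initiator commits and drops its AGF locks.
        agf_owner = {a: t for a, t in agf_owner.items() if t != initiator}
        done.append(work)
    return ("ok", done)


held = {1: "A", 3: "B"}                   # thread A holds AGF 1, B holds AGF 3
print(run_rescuer([("AS", "A", 1), ("BS", "B", 3)], held, {3}, 4))
print(run_rescuer([("BS", "B", 3), ("AS", "A", 1)], held, {3}, 4))
```

With AS queued first, the model reports a deadlock at AG 3; with BS
queued first, both works complete. The only difference is the queue
order on the single rescuer thread, not the lock order.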

We didn't actually trigger such a deadlock, but we did encounter a hung
task similar to the one described in commit
c85007e2e3942da1f9361e4b5a9388ea3a8dcc5b ("xfs: don't use BMBT btree
split workers for IO completion"). That commit fixed our hung task. But
while analyzing the vmcore, we found other btree split works queued on
the rescuer thread, initiated from delayed allocation threads, each of
which held an AGF lock. That led us to this patch and to thinking
about the case above.

So is this possible? It seems that currently, once a split work is
queued onto the rescuer thread, it has no way to know whether another
concurrent split work initiator holds the lock it requires.