Re: Quota-enabled XFS hangs during mount

Dave Chinner <david@xxxxxxxxxxxxx> · Sun, 29 Jan 2017 09:42:42 +1100

On Fri, Jan 27, 2017 at 12:07:34PM -0500, Brian Foster wrote:
> The problem looks like a race between dquot reclaim and quotacheck. The
> high level sequence of events is as follows:
> 
>  - During quotacheck, xfs_qm_dqiterate() walks the physical dquot
>    buffers and queues them to the delwri queue.
>  - Next, kswapd kicks in and attempts to reclaim a dquot that is backed
>    by a buffer on the quotacheck delwri queue. xfs_qm_dquot_isolate()
>    acquires the flush lock and attempts to queue to the reclaim delwri
>    queue. This silently fails because the buffer is already queued.
> 
>    From this point forward, the dquot flush lock is not going to be
>    released until the buffer is submitted for I/O and completed via
>    quotacheck.
>  - Quotacheck continues on to the xfs_qm_flush_one() pass, hits the
>    dquot in question and waits on the flush lock to issue the flush of
>    the recalculated values. *deadlock*
> 
> There are at least a few ways to deal with this. We could do something
> granular to fix up the reclaim path to check whether the buffer is
> already queued or something of that nature before we actually invoke the
> flush. I think this is effectively pointless, however, because the first
> part of quotacheck walks and queues all physical dquot buffers anyways.
> 
> In other words, I think dquot reclaim during quotacheck should probably
> be bypassed.
....

> Note that I think this does mean that you could still have low memory
> issues if you happen to have a lot of quotas defined..

Hmmm..... Really needs fixing.

I think submitting the buffer list after xfs_qm_dqiterate() and
waiting for completion will avoid this problem.

However, I suspect reclaim can still race with flushing, so we need
to detect "stuck" dquots, submit the delwri buffer queue and wait,
then flush the dquot again.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html