Re: XFS: 3-way deadlock with xfs_dquot, xfs_buf and xfs_inode

On Tue, Dec 18, 2018 at 7:33 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Sat, Dec 15, 2018 at 01:34:33PM +0800, 张本龙 wrote:
> > Hi Developers and XFS,
> >
> > There seems to be a deadlock involving 3 threads: 1) the fsync thread
> > has acquired the project quota lock, and is trying to get the xfs_buf
> > (it's an AGF); 2) the xfs_buf is attached to a transaction, and
> > xfs_end_io is trying to get the xfs_inode ilock; 3) the write thread
> > has acquired the xfs_inode ilock, and tries to get the xfs_dquot.
> > Below are the traces.
>
> I don't see a deadlock here. What's holding the AGF lock and
> preventing progress from being made?
>

Oh, I was thinking the AGF is attached to a transaction. So between
xfs_trans_bjoin() and xfs_trans_commit(), a buf cannot be used by
others, right? Then it would be released by the xfs_end_io() thread at
xfs_trans_commit(), and the deadlock would look like:

Thread  1               2                               3
        fsync()
          dqlock P
          agf lock
          <blocks>
                        xfs_end_io
                          (agf locked by transaction)
                          ilock A
                          <blocks>
                          unlock agf in trans commit
                                                        write()
                                                          ilock A
                                                          dqlock P
                                                          <blocks>

> i.e. we have:
>
> process         1               2               3
>         fsync()
>           ilock A
>           dqlock P
>             agf lock
>             <blocks>
>                                 xfs_end_io
>                                   ilock A
>                                   <blocks>
>                                                 write()
>                                                   ilock B
>                                                   dqlock P
>                                                   <blocks>
>
> So, basically, everything is waiting for the AGF lock, which
> is held by something other than these three threads. When the AGF
> lock is released, the fsync() will make progress, then release
> both dqlock P and ilock A and the other two threads will make
> progress again.
>

Absolutely possible. fsync() did indeed acquire the ilock in
xfs_iomap_write_allocate(). Yes, the key is who holds the AGF in this
scenario.

But under either reading it seems the AGF is being held by a transaction
that is blocked in xfs_end_io() on the ilock.
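
To make the two readings concrete, here is a rough sketch (plain Python, not
kernel code) that models who holds what and who waits on what, then walks
the wait-for chain looking for a cycle. The thread and lock names are just
labels taken from the diagrams above, and find_cycle() is a hypothetical
helper written for this illustration. It assumes each thread blocks on at
most one lock and each lock has at most one holder.

```python
def find_cycle(holds, waits):
    """Return the threads forming a wait-for cycle, or None.

    holds: dict mapping thread -> set of locks it currently holds
    waits: dict mapping thread -> the single lock it is blocked on
    """
    # Map each lock to the thread that holds it.
    owner = {lock: t for t, locks in holds.items() for lock in locks}

    for start in waits:
        path = []
        t = start
        # Follow "blocked on a lock held by ..." until the chain ends,
        # leaves the model (unknown holder), or loops back on itself.
        while t in waits and t not in path:
            path.append(t)
            t = owner.get(waits[t])  # None if the holder is not modelled
        if t in path:
            return path[path.index(t):]
    return None


# My guess: the AGF is held by the transaction in the xfs_end_io thread.
poster_cycle = find_cycle(
    holds={"fsync": {"dqlock P"}, "xfs_end_io": {"agf"}, "write": {"ilock A"}},
    waits={"fsync": "agf", "xfs_end_io": "ilock A", "write": "dqlock P"},
)

# Dave's reading: the AGF holder is outside these three threads.
dave_cycle = find_cycle(
    holds={"fsync": {"ilock A", "dqlock P"}, "xfs_end_io": set(),
           "write": {"ilock B"}},
    waits={"fsync": "agf", "xfs_end_io": "ilock A", "write": "dqlock P"},
)

print(poster_cycle)
print(dave_cycle)
```

Under my guess the three threads form a closed loop; under Dave's reading
every chain ends at the unknown AGF holder, so the model finds no cycle and
the question stays the same: who holds the AGF.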

> > It's a 3.10.0-514.16.1.el7.x86_64 kernel, and we hit this about 10-20
> > times a week across several hundred servers.
>
> That's not a mainline kernel and so we can't really help diagnose
> this much further. You should really report it to your distro
> support channels so they can help you with further diagnosis and
> fixes...

Oh, sure, thank you for pointing that out. We don't have support channels
since we use CentOS...

I really appreciate your reply!
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx



