On Sat, Apr 09, 2016 at 08:17:06AM +1000, Dave Chinner wrote: > On Fri, Apr 08, 2016 at 01:18:44PM -0400, Brian Foster wrote: > > On Fri, Apr 08, 2016 at 09:37:45AM +1000, Dave Chinner wrote: > > > Hi folks, > > > > > > This is the second version of this patch set, first posted and > > > described here: > > > > > > http://oss.sgi.com/archives/xfs/2016-04/msg00069.html > > > > > > The only change from the first version is splitting up the first > > > patch into two as Christoph requested - one for the bug fix, the > > > other for the variable renaming. > > > > > > > Did your xfstests testing for this series include generic/233? I'm > > seeing a consistently reproducible test hang. The test is hanging on a > > "xfs_quota -x -c off -ug /mnt/scratch" command. The stack is as follows: > > > > [<ffffffffa0772306>] xfs_qm_dquot_walk.isra.8+0x196/0x1b0 [xfs] > > [<ffffffffa0774a98>] xfs_qm_dqpurge_all+0x78/0x80 [xfs] > > [<ffffffffa07713e8>] xfs_qm_scall_quotaoff+0x148/0x640 [xfs] > > [<ffffffffa077733d>] xfs_quota_disable+0x3d/0x50 [xfs] > > [<ffffffff812c27e3>] SyS_quotactl+0x3b3/0x8c0 > > [<ffffffff81003e17>] do_syscall_64+0x67/0x190 > > [<ffffffff81763f7f>] return_from_SYSCALL_64+0x0/0x7a > > [<ffffffffffffffff>] 0xffffffffffffffff > > > > ... and it looks like the kernel is spinning somehow or another between > > inode reclaim and xfsaild: > > > > ... > > kworker/1:2-210 [001] ...1 895.750591: xfs_perag_get_tag: dev 253:3 agno 1 refcount 1 caller xfs_reclaim_inodes_ag [xfs] > > kworker/1:2-210 [001] ...1 895.750609: xfs_perag_put: dev 253:3 agno 1 refcount 0 caller xfs_reclaim_inodes_ag [xfs] > > kworker/1:2-210 [001] ...1 895.750609: xfs_perag_get_tag: dev 253:3 agno 2 refcount 5 caller xfs_reclaim_inodes_ag [xfs] > > kworker/1:2-210 [001] ...1 895.750611: xfs_perag_put: dev 253:3 agno 2 refcount 4 caller xfs_reclaim_inodes_ag [xfs] > > kworker/1:2-210 [001] ...1 895.750612: xfs_perag_get_tag: dev 253:3 agno 3 refcount 1 caller xfs_reclaim_inodes_ag [xfs] > > kworker/1:2-210 [001] ...1 895.750613: xfs_perag_put: dev 253:3 agno 3 refcount 0 caller xfs_reclaim_inodes_ag [xfs] > > xfsaild/dm-3-12406 [003] ...2 895.760588: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL > > xfsaild/dm-3-12406 [003] ...2 895.810595: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL > > xfsaild/dm-3-12406 [003] ...2 895.860586: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL > > xfsaild/dm-3-12406 [003] ...2 895.910596: xfs_ail_locked: dev 253:3 lip 0xffff8801f8e65d80 lsn 2/5709 type XFS_LI_QUOTAOFF flags IN_AIL > > ... > > No deadlock involving the AIL - it doesn't remove the > XFS_LI_QUOTAOFF from the AIL - the quota code committing the > quotaoff-end transactions is what removes that. IOWs, the dquot walk > has not completed, so quotaoff has not completed, so the > XFS_LI_QUOTAOFF is still in the AIL. > > IOWs, this looks like xfs_qm_dquot_walk() is skipping dquots because > xfs_qm_dqpurge is hitting this: > > xfs_dqlock(dqp); > if ((dqp->dq_flags & XFS_DQ_FREEING) || dqp->q_nrefs != 0) { > xfs_dqunlock(dqp); > return -EAGAIN; > } > > So that means we've got an inode that probably hasn't been > reclaimed, because the last thing that happens during reclaim is the > dquots are detatched from the inode and hence the reference counts > are dropped. > > > FWIW, this only occurs with patch 6 applied. The test and scratch > > devices are both 10GB lvm volumes formatted with mkfs defaults (v5). > > I can't see how patch 6 would prevent an inode from being reclaimed, > as all the changes occur *after* the reclaim decision has been made. > More investigation needed, I guess... > The attached diff addresses the problem for me. Feel free to fold it into the original patch. The regression test I had running failed with an OOM over the weekend. I hadn't seen that before, but then again I haven't seen this test run to completion on this system either due to the original problem. I'll restart it today with this hunk included. Brian > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx > > _______________________________________________ > xfs mailing list > xfs@xxxxxxxxxxx > http://oss.sgi.com/mailman/listinfo/xfs
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c index a60db43..749689c 100644 --- a/fs/xfs/xfs_icache.c +++ b/fs/xfs/xfs_icache.c @@ -818,14 +818,15 @@ xfs_inode_set_reclaim_tag( STATIC void __xfs_inode_clear_reclaim( xfs_perag_t *pag, - xfs_inode_t *ip) + xfs_inode_t *ip, + xfs_ino_t ino) { pag->pag_ici_reclaimable--; if (!pag->pag_ici_reclaimable) { /* clear the reclaim tag from the perag radix tree */ spin_lock(&ip->i_mount->m_perag_lock); radix_tree_tag_clear(&ip->i_mount->m_perag_tree, - XFS_INO_TO_AGNO(ip->i_mount, ip->i_ino), + XFS_INO_TO_AGNO(ip->i_mount, ino), XFS_ICI_RECLAIM_TAG); spin_unlock(&ip->i_mount->m_perag_lock); trace_xfs_perag_clear_reclaim(ip->i_mount, pag->pag_agno, @@ -841,7 +842,7 @@ __xfs_inode_clear_reclaim_tag( { radix_tree_tag_clear(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino), XFS_ICI_RECLAIM_TAG); - __xfs_inode_clear_reclaim(pag, ip); + __xfs_inode_clear_reclaim(pag, ip, ip->i_ino); } /* @@ -1032,7 +1033,7 @@ reclaim: if (!radix_tree_delete(&pag->pag_ici_root, XFS_INO_TO_AGINO(ip->i_mount, ino))) ASSERT(0); - __xfs_inode_clear_reclaim(pag, ip); + __xfs_inode_clear_reclaim(pag, ip, ino); spin_unlock(&pag->pag_ici_lock); /*
_______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs