On Thu, Aug 22, 2019 at 08:55:28PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > > Benjamin Moody reported to Debian that XFS partially wedges when a chgrp > fails on account of being out of disk quota. I ran his reproducer > script: > > # adduser dummy > # adduser dummy plugdev > > # dd if=/dev/zero bs=1M count=100 of=test.img > # mkfs.xfs test.img > # mount -t xfs -o gquota test.img /mnt > # mkdir -p /mnt/dummy > # chown -c dummy /mnt/dummy > # xfs_quota -xc 'limit -g bsoft=100k bhard=100k plugdev' /mnt > > (and then as user dummy) > > $ dd if=/dev/urandom bs=1M count=50 of=/mnt/dummy/foo > $ chgrp plugdev /mnt/dummy/foo > > and saw: > > ================================================ > WARNING: lock held when returning to user space! > 5.3.0-rc5 #rc5 Tainted: G W > ------------------------------------------------ > chgrp/47006 is leaving the kernel with locks still held! > 1 lock held by chgrp/47006: > #0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs] > > ...which is clearly caused by xfs_setattr_nonsize failing to unlock the > ILOCK after the xfs_qm_vop_chown_reserve call fails. Add the missing > unlock. > > Reported-by: benjamin.moody@xxxxxxxxx > Fixes: 253f4911f297 ("xfs: better xfs_trans_alloc interface") > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> > --- > fs/xfs/xfs_iops.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c > index dd4076ae228a..ea614b4ae052 100644 > --- a/fs/xfs/xfs_iops.c > +++ b/fs/xfs/xfs_iops.c > @@ -804,6 +804,7 @@ xfs_setattr_nonsize( > > out_cancel: > xfs_trans_cancel(tp); > + xfs_iunlock(ip, XFS_ILOCK_EXCL); > out_dqrele: > xfs_qm_dqrele(udqp); > xfs_qm_dqrele(gdqp); /me goes back an looks at 253f4911f297 ("xfs: better xfs_trans_alloc interface") Fmeh. The original patch posting did: out_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); -out_trans_cancel: - xfs_trans_cancel(tp); +out_dqrele: xfs_qm_dqrele(udqp); xfs_qm_dqrele(gdqp); return error; Which leaked the transaction. Looks like I screwed up fixing that up on commit - it no longer leaked the transaction, but leaked the lock instead. And 3 and half years later someone notices it... Oh, gawd that code is so grotty! I started saying "maybe we should..." and then stopped when I realised just how much cleanup needs to be done to that function... The above patch fixes the issue, iso consider it: Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx> -- Dave Chinner david@xxxxxxxxxxxxx