[regression, 3.0-rc] xfs: freeze hang in 068

Christoph,

The recent changes to the active transaction accounting to close a
race on freeze can hang the freeze process and hence the filesystem.

SysRq : Show Blocked State
  task                        PC stack   pid father
xfs_io          D ffff88005b7a0f00  5592 32539  32535 0x00020000
 ffff88005bc9dd88 0000000000000082 ffff88005bc9dd28 0000000000000296
 ffff88005bc9c010 ffff88005bff4fe0 0000000000010f80 ffff88005bc9dfd8
 ffff88005bc9dfd8 0000000000010f80 ffff88005cd1dfc0 ffff88005bff4fe0
Call Trace:
 [<ffffffff8150c184>] schedule_timeout+0x97/0xbb
 [<ffffffff81072766>] ? lock_timer_base+0x4d/0x4d
 [<ffffffff8150c1c1>] schedule_timeout_uninterruptible+0x19/0x1b
 [<ffffffff81269786>] xfs_quiesce_attr+0x1d/0x7f
 [<ffffffff81266bb2>] xfs_fs_freeze+0x20/0x2e
 [<ffffffff8110db00>] freeze_super+0x8b/0xca
 [<ffffffff81118abc>] do_vfs_ioctl+0x1d0/0x45c
 [<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
 [<ffffffff81102846>] ? virt_to_head_page+0x9/0x2c
 [<ffffffff81143cb0>] compat_sys_ioctl+0x33c/0x368
 [<ffffffff8110a0f3>] ? do_sys_open+0xee/0x100
 [<ffffffff81514960>] sysenter_dispatch+0x7/0x2e

This is waiting for mp->m_active_trans to reach zero.

fsstress        D 0000000000000000  5376 32541  32540 0x00020000
 ffff88005b68dd48 0000000000000086 ffff88005b68dce8 ffffffff812a97b7
 ffff88005b68c010 ffff88005cf8d7d0 0000000000010f80 ffff88005b68dfd8
 ffff88005b68dfd8 0000000000010f80 ffffffff81a0b020 ffff88005cf8d7d0
Call Trace:
 [<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
 [<ffffffff81080c73>] ? prepare_to_wait+0x71/0x7c
 [<ffffffff81262d5d>] xfs_file_aio_write+0x10a/0x245
 [<ffffffff81080a31>] ? wake_up_bit+0x25/0x25
 [<ffffffff8110b52b>] do_sync_write+0xc6/0x103
 [<ffffffff810eec9e>] ? handle_mm_fault+0xff/0x111
 [<ffffffff8110be64>] vfs_write+0xa9/0x105
 [<ffffffff8110b055>] ? vfs_llseek+0x2e/0x30
 [<ffffffff8110bf79>] sys_write+0x45/0x6c
 [<ffffffff81514960>] sysenter_dispatch+0x7/0x2e

This is waiting for the filesystem to unfreeze.

fsstress        D 0000000000000000  5040 32542  32540 0x00020000
 ffff88005b62fc78 0000000000000082 ffff88005b62fc18 ffffffff812a97b7
 ffff88005b62e010 ffff88005be9c000 0000000000010f80 ffff88005b62ffd8
 ffff88005b62ffd8 0000000000010f80 ffff88005cf8efa0 ffff88005be9c000
Call Trace:
 [<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
 [<ffffffff81080c73>] ? prepare_to_wait+0x71/0x7c
 [<ffffffff81255843>] _xfs_trans_alloc+0x89/0xee
 [<ffffffff81080a31>] ? wake_up_bit+0x25/0x25
 [<ffffffff8125875b>] xfs_trans_alloc+0x13/0x15
 [<ffffffff8125abcb>] xfs_change_file_space+0x1f9/0x2f0
 [<ffffffff81122dda>] ? mntput+0x21/0x23
 [<ffffffff81113e7c>] ? path_put+0x1d/0x21
 [<ffffffff81263301>] xfs_ioc_space+0xc2/0xd3
 [<ffffffff81208f92>] xfs_file_compat_ioctl+0x2e1/0x49b
 [<ffffffff81122cc8>] ? mntput_no_expire+0x50/0x141
 [<ffffffff81122dda>] ? mntput+0x21/0x23
 [<ffffffff8110efd8>] ? vfs_fstat+0x3b/0x45
 [<ffffffff81143b15>] compat_sys_ioctl+0x1a1/0x368
 [<ffffffff81514960>] sysenter_dispatch+0x7/0x2e

This has an active transaction reference (i.e. keeping
mp->m_active_trans > 0) and is waiting for the freeze to complete.

Basically the problem is this:

thread 1				freeze
					SB_FREEZE_WRITE
					sync_filesystem()
					SB_FREEZE_TRANS
					->freeze
xfs_trans_alloc
  atomic_inc(mp->m_active_trans)
  wait on (SB_FREEZE_TRANS)
					xfs_quiesce_attr()
					  while (mp->m_active_trans > 0)
						delay(1);

So effectively we cannot sleep waiting for SB_FREEZE_TRANS to go away
while holding an active transaction reference because the freeze
process does not set and check SB_FREEZE_TRANS/mp->m_active_trans
atomically.
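
The interleaving above can be replayed as a small single-threaded model. This is only a sketch for illustration, not kernel code: struct mount_model, the freeze_level enum, and the three helper functions are hypothetical stand-ins for sb->s_frozen, mp->m_active_trans, xfs_trans_alloc and the wait loop in xfs_quiesce_attr. It assumes the ordering shown in the diagram, i.e. the reference is taken before the freeze state is checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the freeze levels; not the kernel's definitions. */
enum freeze_level { FREEZE_NONE, FREEZE_WRITE, FREEZE_TRANS };

/* Hypothetical stand-in for the two pieces of state involved. */
struct mount_model {
	enum freeze_level frozen;	/* models sb->s_frozen */
	int active_trans;		/* models mp->m_active_trans */
};

/*
 * Thread 1, in the ordering from the diagram: take the active
 * transaction reference first, then look at the freeze state.
 * Returns true if the caller would now sleep waiting for a thaw
 * while still holding the reference.
 */
static bool trans_alloc_inc_then_wait(struct mount_model *mp)
{
	mp->active_trans++;
	return mp->frozen >= FREEZE_TRANS;
}

/* Freezer side: the delay loop can only exit once the count is zero. */
static bool quiesce_can_finish(const struct mount_model *mp)
{
	return mp->active_trans == 0;
}

/* Replays the diagram step by step; returns true if both sides are stuck. */
static bool replay_hang(void)
{
	struct mount_model mp = { FREEZE_NONE, 0 };
	bool writer_blocked, freezer_blocked;

	mp.frozen = FREEZE_WRITE;	/* freeze: SB_FREEZE_WRITE, sync */
	mp.frozen = FREEZE_TRANS;	/* freeze: SB_FREEZE_TRANS, ->freeze */

	writer_blocked = trans_alloc_inc_then_wait(&mp);
	assert(mp.active_trans == 1);	/* reference is held while asleep */

	freezer_blocked = !quiesce_can_finish(&mp);

	/* The writer needs a thaw; the thaw needs the freeze to finish. */
	return writer_blocked && freezer_blocked;
}
```

In this model replay_hang() returns true: the writer sleeps with the count at 1, and the freezer's loop never sees zero, so neither side can make progress.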

I haven't put any thought into how to solve this problem yet, so I'd
suggest that at this late stage we need to revert 315fdfa (xfs: fix
filesystsem freeze race in xfs_trans_alloc) because the race it
fixes is far less critical (i.e. doesn't hang the filesystem) and
harder to hit than the regression introduced here.

I've reproduced this a couple of times now on a 1p/1.5GB x86_64
kernel/i686 userspace VM.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
