On Thu, Dec 16, 2010 at 10:38:47AM -0500, Christoph Hellwig wrote:
> On Mon, Dec 13, 2010 at 03:32:19PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> >
> > The xfsaild often tries to rest to wait for congestion to pass or for
> > IO to complete, but is regularly woken in tail-pushing situations.
> > In severe cases, the xfsaild is getting woken tens of thousands of
> > times a second. Reduce the number of needless wakeups by only waking
> > the xfsaild if the new target is larger than the old one. Further
> > make short sleeps uninterruptible as they occur when the xfsaild has
> > decided it needs to back off to allow some IO to complete and being
> > woken early is counter-productive.
>
> This patch causes softlockup warnings in xfsaild for various testcases
> on my 32-bit x86 VM, but the testcases continue otherwise normally.

What tests?

> Example below:
>
> [ 361.692515] INFO: task xfsaild/vdb5:8705 blocked for more than 120 seconds.
> [ 361.697272] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 361.703929] xfsaild/vdb5 D 00000000 0 8705 2 0x00000000
> [ 361.708148] f4933f10 00000046 f4b37464 00000000 00000000 f4b37100 f4b37100 00000046
> [ 361.711501] f4933eb4 00000046 f4b37100 c0936092 f4b37264 f4b37268 00000000 c0d52d00
> [ 361.714786] c0d52d08 c0e96c00 f5735d38 f4933ec0 f4b37100 f6946c00 f4933eec c0160553
> [ 361.718120] Call Trace:
> [ 361.721856] [<c0936092>] ? _raw_spin_unlock_irq+0x22/0x30
> [ 361.723439] [<c0160553>] ? finish_task_switch+0x73/0x100
> [ 361.725056] [<c0160517>] ? finish_task_switch+0x37/0x100
> [ 361.726592] [<c09334b3>] ? schedule+0x263/0x9d0
> [ 361.727932] [<c0198f4b>] ? trace_hardirqs_off+0xb/0x10
> [ 361.729548] [<c0933f05>] schedule_timeout+0x185/0x250
> [ 361.731258] [<c09360d5>] ? _raw_spin_unlock_irqrestore+0x35/0x60
> [ 361.733037] [<c019c68b>] ? trace_hardirqs_on+0xb/0x10
> [ 361.734513] [<c04ed504>] xfsaild+0x54/0xc0
> [ 361.735786] [<c04ed4b0>] ? xfsaild+0x0/0xc0
> [ 361.737171] [<c0187634>] kthread+0x74/0x80
> [ 361.738446] [<c01875c0>] ? kthread+0x0/0x80
> [ 361.739987] [<c013507a>] kernel_thread_helper+0x6/0x1c
> [ 361.741589] no locks held by xfsaild/vdb5/8705.

So what this is saying is that a 20ms uninterruptible sleep is lasting
for more than 120s? Doesn't that imply some kind of scheduler
starvation, not an actual XFS problem?

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs