On Fri, Jun 05, 2009 at 12:19:07PM -0700, Dan Williams wrote:
> One of the design goals was to prevent the occurrence of the
> softlockup watchdog events which seem to trigger on large raid6
> resyncs.  A per-cpu scheme would still require preempt_disable() while
> the calculation is active, so perhaps we just need a call to
> cond_resched() in raid5d to appease the scheduler.

FWIW we added this to the patches shipped with Lustre:

Index: linux-2.6.18-128.1.1/drivers/md/raid5.c
===================================================================
--- linux-2.6.18-128.1.1.orig/drivers/md/raid5.c
+++ linux-2.6.18-128.1.1/drivers/md/raid5.c
@@ -2987,6 +2987,8 @@ static void raid5d (mddev_t *mddev)
 		handle_stripe(sh, conf->spare_page);
 		release_stripe(sh);
 
+		cond_resched();
+
 		spin_lock_irq(&conf->device_lock);
 	}
 	PRINTK("%d stripes handled\n", handled);

I thought most of these issues were gone in more recent kernels, but we
haven't tested RAID on anything other than RHEL 4+5 extensively (Lustre
doesn't support sufficiently new kernels yet.)

Cheers,
Jody