On 1/13/15 2:46 PM, Tejun Heo wrote:
> So,
>
>   nr_workers == 15,
>   nr_idle == 0,
>   nr_running == 0,
>
> That means one worker must be playing the role of manager by executing
> manage_workers(), which is also responsible for kicking off the
> rescuers if it fails to create new workers in a short period of time.
> The manager is identified as the holder of pool->manager_arb, and while
> a manager is trying to create a worker, pool->mayday_timer must be
> armed, continuously firing off every MAYDAY_INTERVAL and summoning
> rescuers to the pool, which should be visible through the
> pool_pwq->mayday_node corresponding to the stalled pool being queued
> on wq->maydays.
>
> Can you post the full dump of the pool, wq and all kworkers?
>
> Thanks.
>

Just for mailing list archive posterity: Tejun thinks he's found the
culprit in the workqueue code. Either he or I will follow up when he
has a patch ready to go.

Thanks Tejun!

-Eric

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs