Hello, Eric.

On Fri, Jan 09, 2015 at 12:12:04PM -0600, Eric Sandeen wrote:
> I had a case reported where a system under high stress
> got deadlocked.  A btree split was handed off to the xfs
> allocation workqueue, and it is holding the xfs_ilock
> exclusively.  However, other xfs_end_io workers are
> not running, because they are waiting for that lock.
> As a result, the xfs allocation workqueue never gets
> run, and everything grinds to a halt.

I'm having a difficult time following the exact deadlock.  Can you
please elaborate in more detail?

> To be honest, it's not clear to me how the workqueue
> subsystem manages this sort of thing.  But in testing,
> making the allocation workqueue high priority so that
> it gets added to the front of the pending work list
> resolves the problem.  We did similar things for
> the xfs-log workqueues, for similar reasons.

Ummm, this feels pretty voodoo.  In practice it'd change the order in
which things get executed and may make certain deadlocks unlikely
enough, but I don't think this can be a proper fix.

> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index e5bdca9..9c549e1 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -874,7 +874,7 @@ xfs_init_mount_workqueues(
> 		goto out_destroy_log;
> 
> 	mp->m_alloc_workqueue = alloc_workqueue("xfs-alloc/%s",
> -			WQ_MEM_RECLAIM|WQ_FREEZABLE, 0, mp->m_fsname);
> +			WQ_MEM_RECLAIM|WQ_FREEZABLE|WQ_HIGHPRI, 0, mp->m_fsname);

And this at least deserves way more explanation.

> 	if (!mp->m_alloc_workqueue)
> 		goto out_destroy_eofblocks;

Thanks.

-- 
tejun

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs