On Tue, Oct 17 2017, Shaohua Li wrote:

> On Tue, Oct 17, 2017 at 04:04:52PM +1100, Neil Brown wrote:
>>
>> lockdep currently complains about a potential deadlock
>> with sysfs access taking reconfig_mutex, and that
>> waiting for a work queue to complete.
>>
>> The cause is inappropriate overloading of work-items
>> on work-queues.
>>
>> We currently have two work-queues: md_wq and md_misc_wq.
>> They service 5 different tasks:
>>
>>   mddev->flush_work                       md_wq
>>   mddev->event_work (for dm-raid)         md_misc_wq
>>   mddev->del_work (mddev_delayed_delete)  md_misc_wq
>>   mddev->del_work (md_start_sync)         md_misc_wq
>>   rdev->del_work                          md_misc_wq
>>
>> We need to call flush_workqueue() for md_start_sync and ->event_work
>> while holding reconfig_mutex, but mustn't hold it when
>> flushing mddev_delayed_delete or rdev->del_work.
>>
>> md_wq is a bit special as it has WQ_MEM_RECLAIM so it is
>> best to leave that alone.
>>
>> So create a new workqueue, md_del_wq, and a new work_struct,
>> mddev->sync_work, so we can keep two classes of work separate.
>>
>> md_del_wq and ->del_work are used only for destroying rdev
>> and mddev.
>> md_misc_wq is used for event_work and sync_work.
>>
>> Also document the purpose of each flush_workqueue() call.
>>
>> This removes the lockdep warning.
>
> I had the exactly same patch queued internally,

Cool :-)

> but the mdadm test suite still
> shows lockdep warning. I haven't time to check further.
>

The only other lockdep I've seen later was some ext4 thing, though I
haven't tried the full test suite. I might have a look tomorrow.

Thanks,
NeilBrown