On Tue, 2009-03-24 at 12:17 +0100, Peter Zijlstra wrote:

> > But I don't think we've seen a coherent description of what's actually
> > _wrong_ with the current code. flush_cpu_workqueue() has been handling
> > this case for many years with no problems reported as far as I know.
>
> Might be sheer luck, but afaik we did have some actual deadlocks due to
> workqueue flushing -- a particular one I can remember was cpu-hotplug vs
> cpufreq.

Two cases are relevant here actually -- the recursion which hasn't ever
shown up before, and a number of possible deadlocks of e.g. some people
doing, effectively:

	rtnl_lock();
	flush_scheduled_work();
	rtnl_unlock();

vs. the linkwatch work that can, at this point in time, still be queued,
and needs the rtnl as well.

A little digging through git logs finds more references, e.g. commits
f90d4118bacef87894621a3e8aba853fa0c89abc and
fd781fa25c9e9c6fd1599df060b05e7c4ad724e5. Some others were fixed that I
remember, but apparently without putting the lockdep report into the
commit log.

johannes
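
For readers who want the rtnl vs. linkwatch case spelled out, here is a
minimal user-space sketch of that dependency cycle. It is a toy model, not
kernel code: toy_lock, toy_work, toy_flush() and flush_under_lock() are
hypothetical names invented for illustration, and the "lock" is just a flag
so that the example reports the deadlock instead of actually hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the rtnl mutex: only records whether it is held. */
struct toy_lock {
    bool held;
};

/* Toy work item: the lock it needs in order to run (NULL if none). */
struct toy_work {
    struct toy_lock *needs;
    bool pending;
};

/* A work item can only run to completion if the lock it needs is free. */
static bool toy_work_can_run(const struct toy_work *w)
{
    return w->needs == NULL || !w->needs->held;
}

/*
 * Model of a flush: every pending item must finish before the flush
 * returns. Returns true if the flush completes, false if a pending item
 * is blocked on a held lock, i.e. the flusher would wait forever.
 */
static bool toy_flush(struct toy_work *items, size_t n)
{
    assert(items != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!items[i].pending)
            continue;
        if (!toy_work_can_run(&items[i]))
            return false;   /* flusher waits on work, work waits on lock */
        items[i].pending = false;
    }
    return true;
}

/* The pattern from the mail: lock, flush, unlock. */
static bool flush_under_lock(struct toy_lock *rtnl,
                             struct toy_work *items, size_t n)
{
    bool ok;

    rtnl->held = true;          /* models rtnl_lock() */
    ok = toy_flush(items, n);   /* models flush_scheduled_work() */
    rtnl->held = false;         /* models rtnl_unlock(); a real deadlock never gets here */
    return ok;
}
```

With a single pending item that needs the lock (standing in for the
linkwatch work), flush_under_lock() returns false and the item stays
pending, while calling toy_flush() without the lock held completes. In the
real kernel the flusher simply sleeps forever at that point, which is the
kind of dependency lockdep is meant to flag.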