Hello,

On Thu, Oct 31, 2013 at 02:46:27PM -0700, Hugh Dickins wrote:
> On Thu, 31 Oct 2013, Steven Rostedt wrote:
> > On Wed, 30 Oct 2013 19:09:19 -0700 (PDT)
> > Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> >
> > > This is, at least on the face of it, distinct from the workqueue
> > > cgroup hang I was outlining to Tejun and Michal and Steve last week:
> > > that also strikes in mem_cgroup_reparent_charges, but in the
> > > lru_add_drain_all rather than in mem_cgroup_start_move: the
> > > drain of pagevecs on all cpus never completes.
> > >
> >
> > Did anyone ever run this code with lockdep enabled? There is lockdep
> > annotation in the workqueue that should catch a lot of this.
>
> I believe I tried before, but I've just rechecked to be sure:
> lockdep is enabled but silent when we get into that deadlock.

Ooh... I just realized that work_on_cpu() explicitly opts out of flush
lockdep verification by using __flush_work() to allow work_on_cpu()
callback to use work_on_cpu() recursively.  The commit is c2fda509667b
("workqueue: allow work_on_cpu() to be called recursively").  So, if
we have an actual deadlock scenario involving work_on_cpu(), it may
escape lockdep detection.  I'll see if I can update the lockdep
annotation so that it still allows recursive invocation but doesn't
disable lockdep annotation completely.

Thanks.

--
tejun