On Wed 13-04-16 21:23:13, Michal Hocko wrote:
> On Wed 13-04-16 14:33:09, Tejun Heo wrote:
> > Hello, Petr.
> >
> > (cc'ing Johannes)
> >
> > On Wed, Apr 13, 2016 at 11:42:16AM +0200, Petr Mladek wrote:
> > ...
> > > In other words, "memcg_move_char/2860" flushes a work. But it cannot
> > > get flushed because one worker is blocked and another one could not
> > > get created. All these operations are blocked by the very same
> > > "memcg_move_char/2860".
> > >
> > > Note that also "systemd/1" is waiting for "cgroup_mutex" in
> > > proc_cgroup_show(). But it seems that it is not in the main
> > > cycle causing the deadlock.
> > >
> > > I am able to reproduce this problem quite easily (within a few minutes).
> > > There are often even more tasks waiting for the cgroups-related locks,
> > > but they are not causing the deadlock.
> > >
> > >
> > > The question is how to solve this problem. I see several possibilities:
> > >
> > >   + avoid using workqueues in lru_add_drain_all()
> > >
> > >   + make lru_add_drain_all() killable and restartable
> > >
> > >   + do not block fork() while lru_add_drain_all() is running,
> > >     e.g. using some lazy technique like RCU or workqueues
> > >
> > >   + at least do not block forking of workers; AFAIK, they have limited
> > >     cgroups usage anyway because they are marked with PF_NO_SETAFFINITY
> > >
> > >
> > > I am willing to test any potential fix or even work on the fix.
> > > But I do not have that deep an insight into the problem, so I would
> > > need some pointers.
> >
> > An easy solution would be to make lru_add_drain_all() use a
> > WQ_MEM_RECLAIM workqueue.
>
> I think we can live without lru_add_drain_all() in the migration path.
> We are talking about 4 pagevecs so 56 pages. The charge migration is

I wanted to say 56 * num_cpus, of course.

> racy anyway. What concerns me more is how fragile all this is. It sounds
> just too easy to add a dependency on per-cpu sync work later and
> reintroduce this issue, which is quite hard to detect.
>
> Can't we come up with something more robust? Or at least warn when we
> try to use per-cpu workers with problematic locks held?
>
> Thanks!
--
Michal Hocko
SUSE Labs
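
For reference, here is a minimal sketch of what Tejun's WQ_MEM_RECLAIM
suggestion could look like in mm/swap.c. This is not the actual patch:
the workqueue name, the init function and need_drain() are illustrative
only; need_drain() stands in for the pagevec_count() checks that
lru_add_drain_all() already does, and lru_add_drain_per_cpu() /
lru_add_drain_work mirror the existing per-cpu drain callback and work
items. The point is only that the drain works are queued on a
rescuer-backed workqueue instead of the system one.

#include <linux/workqueue.h>
#include <linux/percpu.h>
#include <linux/cpu.h>
#include <linux/mutex.h>
#include <linux/init.h>

/* existing per-cpu drain callback and work items in mm/swap.c */
static void lru_add_drain_per_cpu(struct work_struct *dummy);
static DEFINE_PER_CPU(struct work_struct, lru_add_drain_work);

/* stands in for the existing pagevec_count() checks (assumption) */
static bool need_drain(int cpu);

static struct workqueue_struct *lru_drain_wq;

static int __init lru_drain_wq_init(void)
{
	/*
	 * WQ_MEM_RECLAIM guarantees a rescuer thread, so queued drain
	 * works can still make progress when no new kworker can be
	 * forked, e.g. when fork() is blocked on cgroup_threadgroup_rwsem.
	 */
	lru_drain_wq = alloc_workqueue("lru_drain", WQ_MEM_RECLAIM, 0);
	return lru_drain_wq ? 0 : -ENOMEM;
}
early_initcall(lru_drain_wq_init);

void lru_add_drain_all(void)
{
	static DEFINE_MUTEX(lock);
	static struct cpumask has_work;
	int cpu;

	mutex_lock(&lock);
	get_online_cpus();
	cpumask_clear(&has_work);

	for_each_online_cpu(cpu) {
		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);

		if (need_drain(cpu)) {
			INIT_WORK(work, lru_add_drain_per_cpu);
			/* the rescuer-backed wq instead of schedule_work_on() */
			queue_work_on(cpu, lru_drain_wq, work);
			cpumask_set_cpu(cpu, &has_work);
		}
	}

	for_each_cpu(cpu, &has_work)
		flush_work(&per_cpu(lru_add_drain_work, cpu));

	put_online_cpus();
	mutex_unlock(&lock);
}

This only guarantees that already-queued drain works get a worker; it
does not remove the dependency of lru_add_drain_all() on per-cpu work
completion, which is the fragility Michal is pointing at above.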