Re: Possible regression with cgroups in 3.11

On Fri 22-11-13 10:50:11, Markus Blank-Burian wrote:
> > Hmm, interesting. Either the output is incomplete (because there is no
> > "is going offline" message for memcg:ffffc9001e5c3000) or this happens
> > before offlining. And there is only one such place.
> 
> There is indeed no offline message for memcg:ffffc9001e5c3000; I checked
> that before sending the relevant part of the trace.
> 
> > mem_cgroup_force_empty, which is called when somebody writes to the
> > memory.force_empty file. That, however, doesn't match your previous
> > traces. Maybe yet another issue...
> >
> > Could you apply the patch below on top of what you have already?
> 
> I applied the patch and attached the whole trace log, but no new trace
> output from mem_cgroup_force_empty is present.

Weird! There are no other call sites.
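
For reference, the kind of hook in question boils down to a trace_printk at
the entry of mem_cgroup_force_empty, along these lines (a sketch of the idea
only, not the exact hunk):

    static int mem_cgroup_force_empty(struct mem_cgroup *memcg)
    {
    	/* log every entry; the tracer prefixes the function name itself */
    	trace_printk("memcg:%p\n", memcg);
    	...

So if memory.force_empty were written to, a mem_cgroup_force_empty line
would have to show up in the trace.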

Anyway.
$ grep mem_cgroup_css_offline: trace | sed 's@.*is@@' | sort | uniq -c
    581  going offline now
    580  offline now

So one offline was entered but never finished. I would have assumed it is the one that got stuck, but no:
$ grep mem_cgroup_css_offline: trace | sed 's@.*memcg:\([0-9a-f]*\) .*@\1@' | sort | uniq -c | sort -k1 -n | head -n1
      1 ffffc9001e085000

which is not our ffffc9001e2cf000, and it is not even a single bit flip away.
What might be interesting is this:
$ grep -B1 ffffc9001e2cf000 trace | head -n2
    kworker/8:13-7244  [008] ....   546.743666: mem_cgroup_css_offline: memcg:ffffc9001e085000 is going offline now
     kworker/2:5-6494  [002] ....   620.277552: mem_cgroup_reparent_charges: memcg:ffffc9001e2cf000 u:4096 k:0 tasks:0

So that is the last offline started before the stuck reparenting, and it
began quite some time earlier without ever finishing, or it is looping in
mem_cgroup_reparent_charges with usage > 0. Maybe it is stuck on some other
blocking operation (you've said you have the fix for too many workers
applied, right?).
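
For context, mem_cgroup_reparent_charges keeps retrying until all charges
are moved away, so a single stuck page keeps it spinning. Stripped down, the
3.11 loop is roughly this (paraphrased, details elided):

    static void mem_cgroup_reparent_charges(struct mem_cgroup *memcg)
    {
    	u64 usage;

    	do {
    		/* flush per-cpu LRU caches so pages can be moved */
    		lru_add_drain_all();
    		/* ... per-node/per-zone LRU reparenting elided ... */
    		cond_resched();
    		/* the u:4096 k:0 in the trace corresponds to this difference */
    		usage = res_counter_read_u64(&memcg->res, RES_USAGE) -
    			res_counter_read_u64(&memcg->kmem, RES_USAGE);
    	} while (usage > 0);
    }

With u:4096 the loop cannot terminate until that last page gets uncharged.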

It would be interesting to find out whether this is a general pattern and,
if so, to check the stack traces of the two workers (e.g. via sysrq-w or
/proc/<pid>/stack).

Thanks for your patience!
-- 
Michal Hocko
SUSE Labs