On Mon 29-06-15 12:07:54, Nikolay Borisov wrote:
>
>
> On 06/29/2015 11:32 AM, Michal Hocko wrote:
> > On Thu 25-06-15 18:27:10, Nikolay Borisov wrote:
> >>
> >>
> >> On 06/25/2015 06:18 PM, Michal Hocko wrote:
> >>> On Thu 25-06-15 17:34:22, Nikolay Borisov wrote:
> >>>> On 06/25/2015 05:05 PM, Michal Hocko wrote:
> >>>>> On Thu 25-06-15 16:49:43, Nikolay Borisov wrote:
> >>>>> [...]
> >>>>>> How would you advise rectifying such a situation?
> >>>>>
> >>>>> As I've said, check the OOM victim traces and see if it is holding
> >>>>> any of those locks.
> >>>>
> >>>> As mentioned previously, all OOM traces are identical to the one I've
> >>>> sent - OOM being called from the page fault path.
> >>>
> >>> By identical you mean that all of them kill the same task? Or just that
> >>> the path is the same (which wouldn't be surprising, as this is the only
> >>> path which triggers the memcg OOM killer)?
> >>
> >> The code path is the same, the tasks being killed are different.
> >
> > Is the OOM killer triggered only for a single memcg, or do others
> > misbehave as well?
>
> Generally OOM would be triggered for whichever memcg runs out of
> resources, but so far I've only observed the D-state issue in a single
> container.

It is not clear whether it is the OOM memcg which has tasks in the D
state. Anyway, it all smells like one memcg throttling others on another
shared resource - the journal in your case.

> However, this in turn might affect other processes if they try to
> sleep on the same jbd2 journal.

Sure, if the journal is shared then this is an inherent problem. Memcg
restrictions can easily cause priority inversion problems, as Ted has
already mentioned.

-- 
Michal Hocko
SUSE Labs
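
(An illustrative aside, not something from the thread itself: one rough
way to check the shared-journal theory from userspace is to run a small
fsync latency probe, like the sketch below, in a second container against
a file on the same filesystem. If its fsync times spike whenever the
memory-limited container hits its limit, the stall is propagating through
the shared jbd2 journal. The file name, 4k write size, 100ms reporting
threshold and 100ms probe interval are arbitrary values chosen for
illustration.)

/*
 * fsync_probe.c - minimal fsync latency probe (illustrative sketch).
 * Run one copy in each container, each pointed at a file on the shared
 * ext4 filesystem, and compare the reported outliers.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "probe.dat";
	struct timespec delay = { 0, 100 * 1000 * 1000 };	/* 100ms pause */
	char buf[4096];
	int fd;

	memset(buf, 'x', sizeof(buf));
	fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	for (;;) {
		long long start, elapsed_ms;

		/* A small rewrite plus fsync typically waits on a jbd2 commit. */
		if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
			perror("pwrite");
			break;
		}
		start = now_ns();
		if (fsync(fd) < 0) {
			perror("fsync");
			break;
		}
		elapsed_ms = (now_ns() - start) / 1000000;

		/* Only report the outliers - candidate cross-memcg stalls. */
		if (elapsed_ms > 100)
			printf("fsync took %lld ms\n", elapsed_ms);

		nanosleep(&delay, NULL);
	}
	close(fd);
	return 0;
}

Build with "gcc -o fsync_probe fsync_probe.c" (older glibc may need -lrt
for clock_gettime) and run one instance per container against the shared
filesystem.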