Re: Possible mem cgroup bug in kernels between 4.18.0 and 5.3-rc1.

On Fri 02-08-19 11:00:55, Masoud Sharbiani wrote:
> 
> 
> > On Aug 2, 2019, at 7:41 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > 
> > On Fri 02-08-19 07:18:17, Masoud Sharbiani wrote:
> >> 
> >> 
> >>> On Aug 2, 2019, at 12:40 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >>> 
> >>> On Thu 01-08-19 11:04:14, Masoud Sharbiani wrote:
> >>>> Hey folks,
> >>>> I’ve come across an issue that affects most 4.19, 4.20 and 5.2 linux-stable kernels and that has only been fixed in 5.3-rc1.
> >>>> It was introduced by
> >>>> 
> >>>> 29ef680 memcg, oom: move out_of_memory back to the charge path 
> >>> 
> >>> This commit shouldn't really change the OOM behavior for your particular
> >>> test case. It would have changed MAP_POPULATE behavior but your usage is
> >>> triggering the standard page fault path. The only difference with
> >>> 29ef680 is that the OOM killer is invoked during the charge path rather
> >>> than on the way out of the page fault.
> >>> 
> >>> Anyway, I tried to run your test case in a loop and leaker always ends
> >>> up being killed as expected with 5.2. See the below oom report. There
> >>> must be something else going on. How much swap do you have on your
> >>> system?
> >> 
> >> I do not have swap defined. 
> > 
> > OK, I have retested with swap disabled and again everything seems to be
> > working as expected. The oom happens earlier because I do not have to
> > wait for the swap to get full.
> > 
> 
> In my tests (with the script provided), it loops for only 11 iterations before hanging and printing the soft lockup message.
> 
> 
> > Which fs do you use to write the file that you mmap?
> 
> /dev/sda3 on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)
> 
> Part of the soft lockup path actually specifies that it is going through __xfs_filemap_fault():

Right, I have just missed that.

[...]

> If I switch the backing file to an ext4 filesystem (separate hard drive), it OOMs.
> 
> 
> If I switch the file used to /dev/zero, it OOMs: 
> …
> Todal sum was 0. Loop count is 11
> Buffer is @ 0x7f2b66c00000
> ./test-script-devzero.sh: line 16:  3561 Killed                  ./leaker -p 10240 -c 100000
> 
> 
> > Or could you try to
> > simplify your test even further? E.g. does everything work as expected
> > when doing anonymous mmap rather than file backed one?
> 
> It also OOMs with MAP_ANON. 
> 
> Hope that helps.
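
For readers following along, the two mapping styles being compared above can be sketched roughly as follows. This is not the leaker program from the thread (its source is not shown here); the helper names map_anonymous, map_file and touch_pages are hypothetical, and the use of MAP_PRIVATE for the anonymous case is an assumption. Each helper maps a region and writes one byte per page, so every page is faulted in through the normal page fault path.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def map_anonymous(npages):
    """Anonymous private mapping: fileno -1 means there is no backing file."""
    return mmap.mmap(-1, npages * PAGE,
                     flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)


def map_file(path, npages):
    """Shared file-backed mapping of `path`, grown to npages pages first."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, npages * PAGE)
        return mmap.mmap(fd, npages * PAGE)
    finally:
        os.close(fd)  # the mmap object holds its own duplicate of the fd


def touch_pages(buf, npages):
    """Write one byte at the start of each page so the page is faulted in."""
    for i in range(npages):
        buf[i * PAGE] = 1
    return npages


if __name__ == "__main__":
    n = 16
    anon = map_anonymous(n)
    print("anonymous pages touched:", touch_pages(anon, n))
    anon.close()
    with tempfile.TemporaryDirectory() as d:
        fb = map_file(os.path.join(d, "backing"), n)
        print("file-backed pages touched:", touch_pages(fb, n))
        fb.close()
```

Run inside a memory-limited cgroup with a large enough page count, both variants would be expected to hit the limit; only the file-backed one goes through the filesystem's fault handler, which is the part that differs between xfs and ext4 in the report above.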

It helps to focus more on the xfs reclaim path. Just to be sure, is
there any difference if you use cgroup v2? I do not expect there to be,
but I would like to rule out any v1 artifacts.
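
Before rerunning the test, it may help to confirm which hierarchy the memory controller is actually on. A minimal sketch, assuming the /proc/<pid>/cgroup format described in cgroups(7) (hierarchy-ID:controller-list:path, where the v2 entry has ID 0 and an empty controller list); the function name memory_cgroup_version is hypothetical:

```python
def memory_cgroup_version(proc_cgroup_text):
    """Return 1 if the memory controller is on a v1 hierarchy, 2 if only a
    v2 (unified) entry is present, or None if neither can be determined.

    Note: a result of 2 only says that no v1 memory hierarchy is listed; it
    does not prove the memory controller is enabled for that v2 cgroup.
    """
    has_v2 = False
    for line in proc_cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        hierarchy_id, controllers, _path = parts
        if "memory" in controllers.split(","):
            return 1  # v1 entries name their controllers explicitly
        if hierarchy_id == "0" and controllers == "":
            has_v2 = True  # the unified hierarchy entry looks like "0::/path"
    return 2 if has_v2 else None


if __name__ == "__main__":
    try:
        with open("/proc/self/cgroup") as f:
            print("memory cgroup version:", memory_cgroup_version(f.read()))
    except OSError as exc:
        print("could not read /proc/self/cgroup:", exc)
```

On a hybrid setup both kinds of entries show up, and the function reports 1 because the memory controller is still bound to a v1 hierarchy there.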
-- 
Michal Hocko
SUSE Labs


