Re: Memory cgroup invokes OOM killer when there are a lot of dirty pages

> I assume dd just tried to fault a code page in and that failed due to
> the hard limit and unreclaimable memory. The reason why the memcg v1
> oom throttling heuristic hasn't kicked in is that there are no pages
> under writeback. This would match symptoms of the bug fixed by
> 1c610d5f93c7 ("mm/vmscan: wake up flushers for legacy cgroups too") in
> 4.16 but there might be more. You should have that fix already so there
> must be something more in the game. You've said that you are using blkio
> cgroup, right? What is the configuration? I strongly suspect that none
> of the writeback has started because of the throttling.

I'm only using a memory cgroup, with no blkio restrictions, so I'm not
sure why writeback hasn't started. Another thing I noticed is that the
problem is much harder to reproduce when the same amount of data is
written to a single file rather than to many smaller files. That's why
my original example code writes 500 files of 1MB each.

Your mention of writeback gave me the idea to call sync_file_range()
with SYNC_FILE_RANGE_WRITE after writing each file, to schedule
writeback manually. Surprisingly, that fixed the problem. Does this
indicate a kernel bug where writeback is not triggered in time?
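
For reference, the workaround described above can be sketched roughly as
follows. This is a minimal illustration assuming Linux with glibc, not
the original reproduction code: the helper name
write_and_start_writeback(), the zero-filled buffer, and its size are
made up for the example. SYNC_FILE_RANGE_WRITE asks the kernel to start
writeout of the dirty pages in the range and does not wait for that
writeout to complete.

```c
#define _GNU_SOURCE          /* needed for sync_file_range() in glibc */
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `size` zero bytes to `path`, then ask the
 * kernel to start writeback for the whole file without waiting for it.
 * Returns 0 on success, -1 on error. */
int write_and_start_writeback(const char *path, size_t size)
{
    static const char buf[64 * 1024];   /* zero-filled, size is arbitrary */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t left = size;
    while (left > 0) {
        size_t chunk = left < sizeof(buf) ? left : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);
        if (n <= 0) {                    /* treat short-circuit cases as errors */
            close(fd);
            return -1;
        }
        left -= (size_t)n;
    }

    /* offset 0 with nbytes 0 means "from the start through end of file" */
    if (sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE) < 0) {
        close(fd);
        return -1;
    }
    return close(fd) < 0 ? -1 : 0;
}
```

Calling this once per file, in place of a plain write loop, corresponds
to the "sync after each file" variant that made the OOM disappear here.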

Also, you mentioned that the page fault is probably on a code page.
Would another remedy be to lock the whole executable and its dynamic
libraries in memory with mlock() before starting the I/O operations?
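
To make the question concrete, here is a rough sketch of what I have in
mind. It uses mlockall() rather than per-range mlock() calls, since
MCL_CURRENT covers everything currently mapped, including the
executable's text and the shared libraries. The function name is made
up, and I have not verified that this avoids the OOM. The call can also
fail with EPERM or ENOMEM depending on RLIMIT_MEMLOCK and privileges.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: lock all pages currently mapped into the process
 * (code, shared libraries, data, stack) so they cannot be reclaimed.
 * Adding MCL_FUTURE would also cover mappings created later.
 * Returns 0 on success, -1 on error with errno preserved. */
int lock_current_mappings(void)
{
    if (mlockall(MCL_CURRENT) < 0) {
        int saved = errno;               /* fprintf may change errno */
        fprintf(stderr, "mlockall failed: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

This would be called once at startup, before any files are written.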

-- 
Petros Angelatos
CTO & Founder, Resin.io
BA81 DC1C D900 9B24 2F88  6FDD 4404 DDEE 92BF 1079
