Re: Let's talk about the elephant in the room - the Linux kernel's inability to gracefully handle low memory pressure

On 8/5/19 9:05 AM, Hillf Danton wrote:

On Sun, 4 Aug 2019 09:23:17 +0000 "Artem S. Tashkinov" <aros@xxxxxxx> wrote:
Hello,

There's this bug which has been bugging many people for many years
already and which is reproducible in less than a few minutes under the
latest and greatest kernel, 5.2.6. All the kernel parameters are set to
defaults.

Thanks for the report!

Steps to reproduce:

1) Boot with mem=4G
2) Disable swap to make everything faster (sudo swapoff -a)
3) Launch a web browser, e.g. Chrome/Chromium and/or Firefox
4) Start opening tabs in either of them and watch your free RAM decrease
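
For anyone who wants numbers rather than watching a desktop widget during step 4, here is a minimal Python sketch that samples /proc/meminfo. The function names parse_meminfo and watch_available are made up for illustration, and the sample count and interval are arbitrary assumptions; it only reads the MemFree and MemAvailable fields and changes nothing on the system.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are
    plain counts and carry no unit suffix.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def watch_available(samples=5, interval=1.0, path="/proc/meminfo"):
    """Print MemFree and MemAvailable a fixed number of times.

    samples and interval are illustrative defaults, not recommendations.
    """
    for _ in range(samples):
        with open(path) as f:
            info = parse_meminfo(f.read())
        print("MemFree: %d kB  MemAvailable: %d kB"
              % (info.get("MemFree", 0), info.get("MemAvailable", 0)))
        time.sleep(interval)
```

Calling watch_available(samples=60) in a terminal opened before the tabs should show both values dropping as the stall approaches, assuming the terminal itself stays responsive long enough.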

We saw another corner-case CPU hog report under memory pressure, also
with swap disabled. In that report the xfs filesystem was a factor,
with CONFIG_MEMCG enabled. Is there anything special, say like

  kernel:watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [leaker1:7193]
or
  [ 3225.313209] Xorg: page allocation failure: order:4, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0

in your kernel log?
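
To check a saved kernel log for the two kinds of message quoted above, a minimal Python sketch might look like this. The name scan_kernel_log and the two regular expressions are assumptions derived only from the sample lines in this mail; they are not a complete list of what the kernel can print under memory pressure.

```python
import re

# Patterns modelled on the two sample messages quoted in this thread.
# They are illustrative, not an exhaustive set of kernel warnings.
PATTERNS = {
    "soft_lockup": re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s"),
    "alloc_failure": re.compile(r"page allocation failure: order:(\d+)"),
}


def scan_kernel_log(text):
    """Return (kind, line) pairs for log lines matching a known pattern."""
    hits = []
    for line in text.splitlines():
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((kind, line.strip()))
    return hits
```

The text can come from a file holding the output of dmesg or journalctl -k captured after the system recovers; reading the kernel log may need root on some distributions.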

I'm running ext4 only without LVM, encryption or anything like that.
Plain GPT/MBR partitions with plenty of free space and no disk errors.


Once you hit a situation where opening a new tab requires more RAM than
is currently available, the system will stall hard. You will barely be
able to move the mouse pointer. Your disk LED will be flashing
incessantly (I'm not entirely sure why). You will not be able to run new
applications or close currently running ones.

A CPU hog may come on top of the memory hog in some scenarios.

It might have happened as well - I couldn't know since I wasn't able to
open a terminal. Once the system recovered there was no trace of
anything extraordinary.


This little crisis may continue for minutes or even longer. I think
that's not how the system should behave in this situation. I believe
something must be done about that to avoid this stall.

Yes, Sir.

I'm almost sure some sysctl parameters could be changed to avoid this
situation but something tells me this could be done for everyone and
made default because some non tech-savvy users will just give up on
Linux if they ever get in a situation like this and they won't be keen
or even be able to Google for solutions.

I do not want to repeat that it is hard to produce one pill for all
patients, but the info you post will help solve the crisis sooner.

Hillf


In case you have trouble reproducing this bug, I can publish a VM
image - still everything is quite mundane: Fedora 30 + XFCE + web
browser. Nothing else, nothing fancy.

Regards,
Artem




