Hi,

> -----Original Message-----
> From: David Rientjes [mailto:rientjes@xxxxxxxxxx]
> Sent: Thursday, October 15, 2015 3:35 AM
> To: PINTU KUMAR
> Cc: akpm@xxxxxxxxxxxxxxxxxxxx; minchan@xxxxxxxxxx; dave@xxxxxxxxxxxx;
> mhocko@xxxxxxx; koct9i@xxxxxxxxx; hannes@xxxxxxxxxxx;
> penguin-kernel@i-love.sakura.ne.jp; bywxiaobai@xxxxxxx; mgorman@xxxxxxx;
> vbabka@xxxxxxx; js1304@xxxxxxxxx; kirill.shutemov@xxxxxxxxxxxxxxx;
> alexander.h.duyck@xxxxxxxxxx; sasha.levin@xxxxxxxxxx; cl@xxxxxxxxx;
> fengguang.wu@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> cpgs@xxxxxxxxxxx; pintu_agarwal@xxxxxxxxx; pintu.ping@xxxxxxxxx;
> vishnu.ps@xxxxxxxxxxx; rohit.kr@xxxxxxxxxxx; c.rajkumar@xxxxxxxxxxx
> Subject: RE: [RESEND PATCH 1/1] mm: vmstat: Add OOM victims count in
> vmstat counter
>
> On Wed, 14 Oct 2015, PINTU KUMAR wrote:
>
> > For me it was very helpful during sluggish and long-duration ageing
> > tests. With this, I don't have to look into the logs manually.
> > I just monitor this count in a script.
> > The moment I get nr_oom_victims > 1, I know that a kernel OOM must
> > have happened and I need to take the log dump.
> > So, then I do: dmesg >> oom_logs.txt
> > Or even stop the tests for further tuning.
>
> I think eventfd(2) was created for that purpose, to avoid the constant
> polling that you would have to do to check nr_oom_victims and then take
> a snapshot.
>
> > > I disagree with this one, because we can encounter oom kills due to
> > > fragmentation rather than low memory conditions for high-order
> > > allocations. The amount of free memory may be substantially higher
> > > than all zone watermarks.
> >
> > AFAIK, kernel oom happens only for lower-order allocations
> > (<= PAGE_ALLOC_COSTLY_ORDER).
> > For higher-order allocations we get a page allocation failure.
>
> Order-3 is included. I've seen machines with _gigabytes_ of free memory
> in ZONE_NORMAL on a node and have an order-3 page allocation failure
> that called the oom killer.

Yes, if PAGE_ALLOC_COSTLY_ORDER is defined as 3, then order-3 will be
included for OOM. But that's fine. We are just interested in knowing
whether the system ever entered the OOM state. That is the reason I
earlier also added _oom_stall_, to know if the system entered OOM but
ended in a page allocation failure instead of an OOM kill.

> > > We've long had a desire to have a better oom reporting mechanism
> > > rather than just the kernel log. It seems like you're feeling the
> > > same pain. I think it would be better to have an eventfd notifier
> > > for system oom conditions so we can track kernel oom kills (and
> > > conditions) in userspace. I have a patch for that, and it works
> > > quite well when userspace is mlocked with a buffer in memory.
> >
> > Ok, this would be interesting.
> > Can you point me to the patches?
> > I will quickly check if it is useful for us.
>
> https://lwn.net/Articles/589404. It's invasive and isn't upstream. I
> would like to restructure that patchset to avoid the memcg trickery and
> allow for a root-only eventfd(2) notification through procfs on system
> oom.

I am interested only in the global OOM case, not the memcg one. We have
memcg enabled, but I think even a memcg OOM will finally invoke
_oom_kill_process_. So, I am interested in a patchset that can trigger
notifications from oom_kill_process, as soon as any victim is killed.
Sorry, from your patchset I could not actually locate the system_oom
notification patch. If you have a similar patchset, please point me to
it. It will be really helpful.

Thank you!
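P.S. For reference, the polling workflow described above boils down to
something like the sketch below. This is only a minimal illustration: it
assumes this patch is applied (so nr_oom_victims appears in /proc/vmstat),
and the poll interval and log path are arbitrary examples.

/* Watch nr_oom_victims in /proc/vmstat; snapshot dmesg when it rises. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static long read_oom_victims(void)
{
        FILE *f = fopen("/proc/vmstat", "r");
        char line[128];
        long val = -1;

        if (!f)
                return -1;
        while (fgets(line, sizeof(line), f))
                if (sscanf(line, "nr_oom_victims %ld", &val) == 1)
                        break;
        fclose(f);
        return val;
}

int main(void)
{
        long last = read_oom_victims();

        for (;;) {
                long cur = read_oom_victims();

                if (cur > last) {
                        /* An OOM kill happened: save the kernel log. */
                        system("dmesg >> oom_logs.txt");
                        last = cur;
                }
                sleep(5);
        }
        return 0;
}

This is exactly the constant polling that an eventfd(2)-based
notification, as David suggests, would avoid.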
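For comparison, the eventfd(2) notification that already exists for memcg
(cgroup v1 memory.oom_control) is registered roughly as below. Note it
only fires for a memcg OOM, not for a global/system OOM, which is exactly
the gap discussed above. A minimal sketch, assuming the v1 memory
controller is mounted at /sys/fs/cgroup/memory and a hypothetical group
"mygroup"; error handling omitted.

/* Register an eventfd for OOM events in one memcg, then wait. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/eventfd.h>

int main(void)
{
        const char *grp = "/sys/fs/cgroup/memory/mygroup"; /* example */
        char path[256], buf[64];
        uint64_t count;
        int efd, ofd, cfd;

        efd = eventfd(0, 0);

        snprintf(path, sizeof(path), "%s/memory.oom_control", grp);
        ofd = open(path, O_RDONLY);

        /* Write "<eventfd fd> <oom_control fd>" into cgroup.event_control */
        snprintf(path, sizeof(path), "%s/cgroup.event_control", grp);
        cfd = open(path, O_WRONLY);
        snprintf(buf, sizeof(buf), "%d %d", efd, ofd);
        write(cfd, buf, strlen(buf));

        /* Blocks until this memcg hits OOM; no polling needed. */
        read(efd, &count, sizeof(count));
        printf("memcg oom events: %llu\n", (unsigned long long)count);
        return 0;
}

A root-only system-oom equivalent, as David describes it, would
presumably replace the cgroup files here with a procfs file.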