This patch adds a count of OOM kill victims to /proc/vmstat.

Currently, we depend on the kernel logs to find out that a kernel OOM
occurred. But a kernel OOM can go unnoticed by the developer, because it
can silently kill background applications/services. On some small
embedded systems, the OOM may have been captured in the logs and then
overwritten, since the kernel log is a ring buffer.

This interface lets the user quickly check whether any OOM kill has
happened in the past, i.e. whether the system has entered the OOM kill
stage at any point so far.

It can be beneficial in the following cases:
1. The user can monitor kernel OOM kills without looking into the
   kernel logs.
2. It can help in tuning the watermark level in the system.
3. It can help in tuning the low memory killer behavior in user space.
4. It can be helpful on a logless system, or if klogd logging
   (/var/log/messages) is disabled.

A snapshot of the result of 3 days of overnight testing is shown below:
System: ARM Cortex A7, 1GB RAM, 8GB EMMC
Linux: 3.10.xx
Category: reference smart phone device
Loglevel: 7
Conditions: Fully loaded, BT/WiFi/GPS ON
Tests: auto launching of ~30+ apps using test scripts, in a loop for
3 days.

At the end of tests, check:
$ cat /proc/vmstat
nr_oom_victims 6

The counter shows that there were 6 OOM kill victims during the test.
OOM kills are bad for any system, so this counter can help in quickly
tuning the OOM behavior of the system without depending on the logs.

Signed-off-by: Pintu Kumar <pintu.k@xxxxxxxxxxx>
---
V2: Removed oom_stall,
    Suggested By: Michal Hocko <mhocko@xxxxxxxxxx>
    Renamed oom_kill_count to nr_oom_victims,
    Suggested By: Michal Hocko <mhocko@xxxxxxxxxx>
    Suggested By: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>

 include/linux/vm_event_item.h |    1 +
 mm/oom_kill.c                 |    2 ++
 mm/page_alloc.c               |    1 -
 mm/vmstat.c                   |    1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 2b1cef8..dd2600d 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -57,6 +57,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 #ifdef CONFIG_HUGETLB_PAGE
 		HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
 #endif
+		NR_OOM_VICTIMS,
 		UNEVICTABLE_PGCULLED,	/* culled to noreclaim list */
 		UNEVICTABLE_PGSCANNED,	/* scanned for reclaimability */
 		UNEVICTABLE_PGRESCUED,	/* rescued from noreclaim list */
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 03b612b..802b8a1 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -570,6 +570,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 	 * space under its control.
 	 */
 	do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
+	count_vm_event(NR_OOM_VICTIMS);
 	mark_oom_victim(victim);
 	pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",
 		task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
@@ -600,6 +601,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
 					task_pid_nr(p), p->comm);
 			task_unlock(p);
 			do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
+			count_vm_event(NR_OOM_VICTIMS);
 		}
 	rcu_read_unlock();
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9bcfd70..fafb09d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2761,7 +2761,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		schedule_timeout_uninterruptible(1);
 		return NULL;
 	}
-
 	/*
 	 * Go through the zonelist yet one more time, keep very high watermark
 	 * here, this is only to catch a parallel oom killing, we must fail if
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1fd0886..8503a2e 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -808,6 +808,7 @@ const char * const vmstat_text[] = {
 	"htlb_buddy_alloc_success",
 	"htlb_buddy_alloc_fail",
 #endif
+	"nr_oom_victims",
 	"unevictable_pgs_culled",
 	"unevictable_pgs_scanned",
 	"unevictable_pgs_rescued",
-- 
1.7.9.5