On Mon, 1 Mar 2010, Balbir Singh wrote:

> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -580,6 +580,44 @@ void clear_zonelist_oom(struct zonelist *zonelist, gfp_t gfp_mask)
> >  }
> >  
> >  /*
> > + * Try to acquire the oom killer lock for all system zones. Returns zero if a
> > + * parallel oom killing is taking place, otherwise locks all zones and returns
> > + * non-zero.
> > + */
> > +static int try_set_system_oom(void)
> > +{
> > +	struct zone *zone;
> > +	int ret = 1;
> > +
> > +	spin_lock(&zone_scan_lock);
> > +	for_each_populated_zone(zone)
> > +		if (zone_is_oom_locked(zone)) {
> > +			ret = 0;
> > +			goto out;
> > +		}
> > +	for_each_populated_zone(zone)
> > +		zone_set_flag(zone, ZONE_OOM_LOCKED);
> > +out:
> > +	spin_unlock(&zone_scan_lock);
> > +	return ret;
> > +}
> 
> Isn't this overkill? If pagefault_out_of_memory() does nothing and
> oom takes longer than anticipated, we might end up looping, no?
> Aren't we better off waiting for OOM to finish and retrying the
> pagefault?
> 

I agree; I can add schedule_timeout_uninterruptible(1) so we back off
between iterations while waiting for the parallel oom kill to complete.

It's not overkill because we want to avoid needlessly killing tasks when
killing one will already free memory that is hopefully usable by the
pagefault. This merely covers the race between a parallel oom kill
calling out_of_memory() and setting TIF_MEMDIE for a task, which would
make the following out_of_memory() call in pagefault_out_of_memory() a
no-op anyway.

> And like Kame said, the pagefault code in memcg is undergoing churn;
> we should revisit those parts later. I have yet to review that
> patchset, though.
> 

Kame said earlier that it would be no problem to rebase his memcg oom
work on mmotm if my patches were merged.
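
For reference, here is roughly what I have in mind for the pagefault
path. This is only a sketch: clear_system_oom() is a hypothetical
counterpart to try_set_system_oom() that isn't in the quoted hunk, and
the out_of_memory() arguments assume the nodemask variant of its
signature from this series:

	/*
	 * Hypothetical counterpart to try_set_system_oom(): drops
	 * ZONE_OOM_LOCKED on every populated zone under zone_scan_lock.
	 */
	static void clear_system_oom(void)
	{
		struct zone *zone;

		spin_lock(&zone_scan_lock);
		for_each_populated_zone(zone)
			zone_clear_flag(zone, ZONE_OOM_LOCKED);
		spin_unlock(&zone_scan_lock);
	}

	void pagefault_out_of_memory(void)
	{
		/*
		 * Only call the oom killer if no other oom kill is already
		 * in progress anywhere on the system.
		 */
		if (try_set_system_oom()) {
			out_of_memory(NULL, 0, 0, NULL);
			clear_system_oom();
		}
		/*
		 * Back off for a tick so a parallel oom kill has a chance
		 * to set TIF_MEMDIE and free memory before the fault is
		 * retried.
		 */
		schedule_timeout_uninterruptible(1);
	}

If try_set_system_oom() fails, some other oom kill is in flight and
should free memory for us, so the fault is simply retried after the
backoff rather than killing another task.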