On Fri 30-11-12 17:19:23, Michal Hocko wrote:
[...]
> The important question is why you see VM_FAULT_OOM and whether memcg
> charging failure can trigger that. I do not see how this could happen
> right now because __GFP_NORETRY is not used for user pages (except for
> THP which disables memcg OOM already), file backed page faults (aka
> __do_fault) use mem_cgroup_newpage_charge which doesn't disable OOM.
> This is a real head scratcher.

The following should print the traces when we hand over ENOMEM to the
caller. It should catch all charge paths (migration is not covered but
that one is not important here). If we don't see any traces from here
and there is still a global OOM striking, then something else must be
triggering it.

Could you test this together with the patch which aims at fixing your
deadlock, please? I realise that this is a production environment but I
do not see anything relevant in the code.
---
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c8425b1..9e5b56b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2397,6 +2397,7 @@ done:
 	return 0;
 nomem:
 	*ptr = NULL;
+	__WARN();
 	return -ENOMEM;
 bypass:
 	*ptr = NULL;
-- 
Michal Hocko
SUSE Labs