On Thu, 8 Mar 2012, Andrew Morton wrote:

> > It closes the risk of livelock if an oom killed thread, thread A, cannot
> > exit because it's blocked on another thread, thread B, which cannot exit
> > because it requires memory in the exit path and doesn't have access to
> > memory reserves.  So this patch makes it more likely that an oom killed
> > thread will be able to exit without livelocking.
>
> But it also "allow to eat all of reserve memory and bring us new
> serious failure".  In theory, at least.
>

Exactly, "in theory."  We've never seen an issue where a set of threads in
do_exit() allocated memory at the same time to deplete all memory reserves
while never freeing the memory, so that reclaim consistently fails and all
threads continue to enter the oom killer to get access to memory reserves.
And, with the way the code is written before this patch, only one thread
will have access to memory reserves and the oom killer will be a no-op
until it exits.

There's a much higher likelihood that an oom killed thread may not exit
because it's blocked on another thread that requires memory.  That's what
this patch addresses.

> And afaict the proposed patch is a theoretical thing as well.  Has
> anyone sat down and created tests to demonstrate either problem?

We've run with this patch internally for a year because an oom killed
thread can't exit.  We used to address this with an oom killer timeout
that would kill another thread only after 10s, but it was much faster to
just give exiting threads access to memory reserves and let them exit.
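
For illustration only, the thread A / thread B scenario described above can be modelled in userspace. This is a toy sketch, not the patch under discussion and not kernel code: every name in it (`toy_thread`, `toy_try_exit`, `toy_simulate`) is hypothetical, and the page counts are arbitrary assumptions. Thread A has been oom killed and has access to reserves but is blocked on thread B; thread B is in its exit path and needs two pages while no free memory is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. None of these names are kernel symbols. */
struct toy_thread {
    bool exiting;                  /* already in its exit path */
    bool reserve_access;           /* may allocate from memory reserves */
    bool exited;
    int exit_pages;                /* pages the exit path must allocate */
    struct toy_thread *blocked_on; /* thread that must exit first, or NULL */
};

/* Attempt to finish one thread's exit; returns true once it has exited. */
static bool toy_try_exit(struct toy_thread *t, int *free_pages,
                         int *reserve_pages)
{
    if (t->exited)
        return true;
    if (t->blocked_on != NULL && !t->blocked_on->exited)
        return false; /* still waiting on the other thread */
    if (*free_pages >= t->exit_pages)
        *free_pages -= t->exit_pages;
    else if (t->reserve_access && *reserve_pages >= t->exit_pages)
        *reserve_pages -= t->exit_pages;
    else
        return false; /* allocation in the exit path failed */
    t->exited = true;
    return true;
}

/*
 * Returns true if the oom killed thread A manages to exit within
 * max_rounds. exiting_gets_reserves selects the modelled policy: when
 * true, an exiting thread that hits the oom condition is also given
 * access to reserves.
 */
bool toy_simulate(bool exiting_gets_reserves, int max_rounds)
{
    int free_pages = 0;    /* the system is out of memory */
    int reserve_pages = 8; /* arbitrary reserve size (assumption) */
    struct toy_thread b = { .exiting = true, .exit_pages = 2 };
    struct toy_thread a = { .exiting = true, .reserve_access = true,
                            .blocked_on = &b };

    assert(max_rounds > 0);
    for (int round = 0; round < max_rounds; round++) {
        /* B's allocation fails, so it reaches the modelled oom path. */
        if (exiting_gets_reserves && b.exiting)
            b.reserve_access = true;
        toy_try_exit(&b, &free_pages, &reserve_pages);
        if (toy_try_exit(&a, &free_pages, &reserve_pages))
            return true;
    }
    return false; /* no progress: the model's stand-in for livelock */
}
```

Under these assumptions, `toy_simulate(false, 1000)` returns false because B never gets its two pages and A stays blocked, while `toy_simulate(true, 1000)` returns true in the first round. The model deliberately ignores reclaim, concurrency, and the case where many exiting threads drain the reserves at once, which is the theoretical concern raised in the quoted text.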