On Wed, Aug 14, 2013 at 11:28 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
>
> OK that would suggest the issue has been introduced by 597e1c35:
> (mm/mmu_gather: enable tlb flush range in generic mmu_gather) in 3.6
> which is not 3.7 when Ben started seeing the issue but this definitely
> smells like a bug that would be amplified by the bisected patch.

Yes, the bug was originally introduced in 597e1c35, but in practice it
never happened, because the force_flush case would not ever really
trigger unless __get_free_pages(GFP_NOWAIT) returned NULL. Which is
*very* rare.

So the commit that Ben bisected things down to wasn't the one that
really introduced the bug, but it was the one that made
tlb_next_batch() much more likely to return failure, which in turn
made it much easier to *expose* the bug.

NOTE! I still absolutely want Ben to actually test that fix (ie
backport commit e6c495a96ce0 to his tree), because without testing
this is all just theoretical, and there might be other things hiding
here. But it makes sense to me, and I think this already-known bug
explains the symptoms.

             Linus
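
The reasoning above (a latent bug on a path that only runs when batch allocation fails) can be illustrated with a toy user-space model. This is a sketch under stated assumptions, not the kernel's mmu_gather code: `toy_gather`, `toy_next_batch()`, `toy_remove_page()`, `toy_zap_range()`, `BATCH_SLOTS` and the `alloc_budget` field are all hypothetical names invented for illustration. It does not reproduce the bug itself; it only shows how the failure rate of the batch allocation decides how often the forced-flush path is exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SLOTS 4  /* hypothetical: pages one batch can hold */

/* Toy stand-in for the gather state; not the kernel's struct mmu_gather. */
struct toy_gather {
    size_t used;           /* slots filled in the current batch */
    size_t forced_flushes; /* times the walk had to flush mid-range */
    size_t alloc_budget;   /* how many more batch allocations will succeed */
};

/* Models the batch allocation: succeeds only while budget remains. */
static bool toy_next_batch(struct toy_gather *g)
{
    if (g->alloc_budget == 0)
        return false;      /* models the page allocation returning NULL */
    g->alloc_budget--;
    g->used = 0;           /* a fresh, empty batch becomes active */
    return true;
}

/* Queue one page; returns false when the caller must force a flush. */
static bool toy_remove_page(struct toy_gather *g)
{
    assert(g->used < BATCH_SLOTS);
    g->used++;
    if (g->used == BATCH_SLOTS)
        return toy_next_batch(g);
    return true;
}

/* Walk npages pages and count how often the forced-flush path runs. */
static size_t toy_zap_range(struct toy_gather *g, size_t npages)
{
    for (size_t i = 0; i < npages; i++) {
        if (!toy_remove_page(g)) {
            g->forced_flushes++; /* the rarely exercised path */
            g->used = 0;         /* flushing empties the batch */
        }
    }
    return g->forced_flushes;
}
```

With a generous `alloc_budget` the forced-flush branch never runs, so a bug living there stays hidden; with a budget of zero it runs once per full batch. In this model, a change that makes the allocation fail more often exposes such a bug without having introduced it.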