On Fri, 11 Nov 2016, Kirill A. Shutemov wrote:
> On Fri, Nov 11, 2016 at 05:42:11PM +0530, Aneesh Kumar K.V wrote:
> >
> > doing this in do_set_pmd keeps this closer to where we set the pmd. Any
> > reason you think we should move it higher up the stack? We already do
> > pte_alloc() at the same level for a non transhuge case in
> > alloc_set_pte().
>
> I vaguely remember Hugh mentioned deadlock of allocation under page-lock vs.
> OOM-killer (or something else?).

You remember well. It was indeed the OOM killer, but in particular due
to the way it used to wait for a current victim to exit, and that exit
could be delayed forever by the way munlock_vma_pages_all() goes to lock
each page in a VM_LOCKED area - a pity if one of them is the page we
hold locked while servicing a fault and need to allocate a pagetable.

>
> If the deadlock is still there it would be matter of making preallocation
> unconditional to fix the issue.

I think enough has changed at the OOM killer end that the deadlock is
no longer there. I haven't kept up with all the changes made recently,
but I think we no longer wait for a unique victim to exit before trying
another (reaped mms set MMF_OOM_SKIP); and the OOM reaper skips over
VM_LOCKED areas to avoid just such a deadlock.

It's still silly that munlock_vma_pages_all() should require page lock
on each of those pages; but neither Michal nor I have had time to
revisit our attempts to relieve that requirement - mlock.c is not easy.

>
> But what you propose above doesn't make situation any worse. I'm fine with
> that.

Yes, I think that's right: if there is a problem, then it would already
be a problem since alloc_set_pte() was created; but we've seen no reports.

Hugh