On Fri 07-12-18 08:16:15, Michal Hocko wrote:
[...]
> Memcg v1 indeed doesn't have any dirty IO throttling and this is a
> poor man's workaround. We still do not have that AFAIK and I do not know
> of an elegant way around that. Fortunately we shouldn't have that many
> GFP_KERNEL | __GFP_ACCOUNT allocations under page lock and we can work
> around this specific one quite easily. I haven't tested this yet but the
> following should work
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 4ad2d293ddc2..59c98eeb0260 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2993,6 +2993,16 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>  	struct vm_area_struct *vma = vmf->vma;
>  	vm_fault_t ret;
>  
> +	/*
> +	 * Preallocate pte before we take page_lock because this might lead to
> +	 * deadlocks for memcg reclaim which waits for pages under writeback.
> +	 */
> +	if (!vmf->prealloc_pte) {
> +		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm, vmf->address);
> +		if (!vmf->prealloc_pte)
> +			return VM_FAULT_OOM;
> +	}
> +
>  	ret = vma->vm_ops->fault(vmf);
>  	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
>  			    VM_FAULT_DONE_COW)))

This is too eager: it allocates a pte even when one is not really needed.
Jack has also pointed out that I am missing a write barrier. So here we
go with an updated patch. This is essentially what the fault-around code
does.

diff --git a/mm/memory.c b/mm/memory.c
index 4ad2d293ddc2..1a73d2d4659e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2993,6 +2993,17 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret;
 
+	/*
+	 * Preallocate pte before we take page_lock because this might lead to
+	 * deadlocks for memcg reclaim which waits for pages under writeback.
+	 */
+	if (pmd_none(*vmf->pmd) && !vmf->prealloc_pte) {
+		vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm, vmf->address);
+		if (!vmf->prealloc_pte)
+			return VM_FAULT_OOM;
+		smp_wmb(); /* See comment in __pte_alloc() */
+	}
+
 	ret = vma->vm_ops->fault(vmf);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
 			    VM_FAULT_DONE_COW)))
-- 
Michal Hocko
SUSE Labs
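
For anyone who wants to see the ordering in isolation, here is a rough userspace sketch of the same idea: do the allocation that may block before the lock is taken, and only consume the preallocated object while the lock is held. This is not kernel code. The names fault_ctx, page_locked, alloc_may_block and handle_fault are all made up for illustration, a plain flag stands in for the page lock, and malloc() stands in for pte_alloc_one(). The memory barrier from the patch is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct fault_ctx {
	void *prealloc;   /* plays the role of vmf->prealloc_pte */
	void *installed;  /* plays the role of an already populated pmd entry */
};

static bool page_locked; /* models "we currently hold the page lock" */

/* An allocation that may block in reclaim: it must never run under the lock. */
static void *alloc_may_block(size_t size)
{
	assert(!page_locked);
	return malloc(size);
}

/* Returns 0 on success, -1 if the up-front allocation fails. */
static int handle_fault(struct fault_ctx *ctx)
{
	/* Allocate only when nothing is installed and nothing is preallocated. */
	if (!ctx->installed && !ctx->prealloc) {
		ctx->prealloc = alloc_may_block(64);
		if (!ctx->prealloc)
			return -1;
	}

	page_locked = true;   /* the ->fault() handler would lock the page here */
	if (!ctx->installed) {
		/* Under the lock we only hand over what was allocated earlier. */
		ctx->installed = ctx->prealloc;
		ctx->prealloc = NULL;
	}
	page_locked = false;
	return 0;
}
```

Calling handle_fault() twice on the same context allocates only on the first call; the second call finds the entry installed and skips the allocation, which is the point of the extra pmd_none() check in the updated patch.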