On Tue, Dec 04, 2012 at 02:54:08PM +0200, Tommi Rantala wrote: > 2012/10/9 Mel Gorman <mgorman@xxxxxxx>: > > commit 00442ad04a5eac08a98255697c510e708f6082e2 upstream. > > > > Commit cc9a6c877661 ("cpuset: mm: reduce large amounts of memory barrier > > related damage v3") introduced a potential memory corruption. > > shmem_alloc_page() uses a pseudo vma and it has one significant unique > > combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED. > > > > get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL > > and mpol_cond_put() DOES decrease a policy ref when a policy has > > MPOL_F_SHARED. Therefore, when a cpuset update race occurs, > > alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the > > reference count and frees the policy prematurely. > > Hello, > > kmemleak is complaining about memory leaks that point to the mbind() > syscall. I've seen this only in v3.7-rcX, so I bisected this, and > found that this patch is the first mainline commit where I'm able to > reproduce it with Trinity. > Uncool. I'm writing this from an airport so am not in the position to test properly but at a glance I'm not seeing what drops the reference count taken by mpol_shared_policy_lookup() in all cases. vm_ops->get_policy() probably gets it right but what about shmem_alloc_page() and shmem_swapin()? This patch is only compile tested. If the reference counts are dropped somewhere I did not spot quickly then it'll cause a use-after-free bug instead but is worth trying anyway. diff --git a/mm/shmem.c b/mm/shmem.c index 89341b6..6229a43 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -912,6 +912,7 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp, { struct mempolicy mpol, *spol; struct vm_area_struct pvma; + struct page *page; spol = mpol_cond_copy(&mpol, mpol_shared_policy_lookup(&info->policy, index)); @@ -922,13 +923,19 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp, pvma.vm_pgoff = index + info->vfs_inode.i_ino; pvma.vm_ops = NULL; pvma.vm_policy = spol; - return swapin_readahead(swap, gfp, &pvma, 0); + page = swapin_readahead(swap, gfp, &pvma, 0); + + /* Drop reference taken by mpol_shared_policy_lookup() */ + mpol_cond_put(pvma.vm_policy); + + return page; } static struct page *shmem_alloc_page(gfp_t gfp, struct shmem_inode_info *info, pgoff_t index) { struct vm_area_struct pvma; + struct page *page; /* Create a pseudo vma that just contains the policy */ pvma.vm_start = 0; @@ -940,7 +947,12 @@ static struct page *shmem_alloc_page(gfp_t gfp, /* * alloc_page_vma() will drop the shared policy reference */ - return alloc_page_vma(gfp, &pvma, 0); + page = alloc_page_vma(gfp, &pvma, 0); + + /* Drop reference taken by mpol_shared_policy_lookup() */ + mpol_cond_put(pvma.vm_policy); + + return page; } #else /* !CONFIG_NUMA */ #ifdef CONFIG_TMPFS -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>