Re: [PATCH 2/3] mm,migration: Prevent rmap_walk_[anon|ksm] seeing the wrong VMA information

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 28, 2010 at 01:10:07AM +0200, Andrea Arcangeli wrote:
> On Tue, Apr 27, 2010 at 10:30:51PM +0100, Mel Gorman wrote:
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index f90ea92..61d6f1d 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -578,6 +578,9 @@ again:			remove_next = 1 + (end > next->vm_end);
> >  		}
> >  	}
> >  
> > +	if (vma->anon_vma)
> > +		spin_lock(&vma->anon_vma->lock);
> > +
> >  	if (root) {
> >  		flush_dcache_mmap_lock(mapping);
> >  		vma_prio_tree_remove(vma, root);
> > @@ -620,6 +623,9 @@ again:			remove_next = 1 + (end > next->vm_end);
> >  	if (mapping)
> >  		spin_unlock(&mapping->i_mmap_lock);
> >  
> > +	if (vma->anon_vma)
> > +		spin_unlock(&vma->anon_vma->lock);
> > +
> >  	if (remove_next) {
> >  		if (file) {
> >  			fput(file);
> 
> The old code did:
> 
>     /*
>      * When changing only vma->vm_end, we don't really need
>      * anon_vma lock.
>      */
>     if (vma->anon_vma && (insert || importer || start !=  vma->vm_start))
> 	anon_vma = vma->anon_vma;
>     if (anon_vma) {
>         spin_lock(&anon_vma->lock);
> 
> why did it become unconditional? (and no idea why it was removed)
> 

It became unconditional because I wasn't sure of the optimisation versus the
new anon_vma changes (doesn't matter, should have been safe). At the time
the patch was introduced, the bug looked like a race in VMA's in the list
having their details modified. I thought vma_address was returning -EFAULT
when it shouldn't and while this may still be possible, it wasn't the prime
cause of the bug.

The more important race was in execve between when a VMA got moved and the
page tables copied. The anon_vma locks are fine for the VMA move but the
page table copy happens later. What the patch did was alter the timing of
the race. rmap_walk() was finding the VMA of the new stack being set up by
exec, failing to lock it and backing off. By the time it would restart and
get back to that VMA, it was already moved making the bug simply harder to
reproduce because the race window was so small.

So, the VMA list does not appear to be messed up but there still needs
to be protection against modification of VMA details that are already on
the list. For that, the seq counter would have been enough and
lighter-weight than acquiring the anon_vma->lock every time in
vma_adjust().

I'll drop this patch again as the execve race looks the most important.

> But I'm not sure about this part.... this is really only a question, I
> may well be wrong, I just don't get it.
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]