Re: [PATCH 1/2] mm: close race between do_fault_around() and fault_around_bytes_set()

On Tue, Jul 29, 2014 at 03:36:57PM -0700, David Rientjes wrote:
> On Tue, 29 Jul 2014, Kirill A. Shutemov wrote:
> 
> > Things can go wrong if fault_around_bytes is changed while
> > do_fault_around() is running: between fault_around_mask() and
> > fault_around_pages().
> > 
> > Let's read fault_around_bytes only once during do_fault_around() and
> > calculate mask based on the reading.
> > 
> > Note: fault_around_bytes can only be updated via debug interface. Also
> > I've tried but was not able to trigger a bad behaviour without the
> > patch. So I would not consider this patch as urgent.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> > ---
> >  mm/memory.c | 17 +++++++++++------
> >  1 file changed, 11 insertions(+), 6 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9d66bc66f338..7f4f0c41c9e9 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -2772,12 +2772,12 @@ static unsigned long fault_around_bytes = rounddown_pow_of_two(65536);
> >  
> >  static inline unsigned long fault_around_pages(void)
> >  {
> > -	return fault_around_bytes >> PAGE_SHIFT;
> > +	return ACCESS_ONCE(fault_around_bytes) >> PAGE_SHIFT;
> 
> Not sure why this is being added here, ACCESS_ONCE() would be needed 
> depending on the context in which the return value is used, 
> do_read_fault() won't need it.

Fair enough. I'll move it.

> >  }
> >  
> > -static inline unsigned long fault_around_mask(void)
> > +static inline unsigned long fault_around_mask(unsigned long nr_pages)
> >  {
> > -	return ~(fault_around_bytes - 1) & PAGE_MASK;
> > +	return ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK;
> >  }
> >  
> >  
> 
> This patch is corrupted because of the newline here that doesn't exist in 
> linux-next.

I'll recheck.

> > @@ -2844,12 +2844,17 @@ late_initcall(fault_around_debugfs);
> >  static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
> >  		pte_t *pte, pgoff_t pgoff, unsigned int flags)
> >  {
> > -	unsigned long start_addr;
> > +	unsigned long start_addr, nr_pages;
> >  	pgoff_t max_pgoff;
> >  	struct vm_fault vmf;
> >  	int off;
> >  
> > -	start_addr = max(address & fault_around_mask(), vma->vm_start);
> > +	nr_pages = fault_around_pages();
> > +	/* race with fault_around_bytes_set() */
> > +	if (unlikely(nr_pages <= 1))
> > +		return;
> 
> Why exactly is this unlikely if fault_around_bytes is tunable via debugfs 
> to equal PAGE_SIZE?  I assume we're expecting nobody is going to be doing 
> that, otherwise we'll hit the unlikely() branch here every time.

No. We reach do_fault_around() only after the fault_around_pages() check
in do_read_fault(), so the branch is taken only in the race case.
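
For illustration, here is a minimal userspace sketch of the read-once idea
behind the patch: take a single snapshot of the tunable and derive both the
page count and the mask from it, so a concurrent write cannot make the two
disagree. All snap_* / SNAP_* names, the 4 KiB page size and the volatile
load standing in for ACCESS_ONCE() are assumptions made for this sketch. It
is not the kernel code.

```c
#include <assert.h>

/* Assumed values for the sketch: 4 KiB pages. */
#define SNAP_PAGE_SHIFT 12
#define SNAP_PAGE_SIZE  (1UL << SNAP_PAGE_SHIFT)
#define SNAP_PAGE_MASK  (~(SNAP_PAGE_SIZE - 1))

/* Stand-in for the debugfs tunable; a writer may change it at any time. */
static volatile unsigned long snap_fault_around_bytes = 65536;

/* One volatile load, playing the role of ACCESS_ONCE() in this sketch. */
static unsigned long snap_pages(void)
{
	return snap_fault_around_bytes >> SNAP_PAGE_SHIFT;
}

/* The mask is computed from the caller's snapshot, not from the tunable. */
static unsigned long snap_mask(unsigned long nr_pages)
{
	/* The mask arithmetic is only meaningful for a power-of-two count. */
	assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
	return ~(nr_pages * SNAP_PAGE_SIZE - 1) & SNAP_PAGE_MASK;
}

/* Hypothetical helper: start of the fault-around window, clamped to vm_start. */
static unsigned long snap_start_addr(unsigned long address,
				     unsigned long vm_start)
{
	unsigned long nr_pages = snap_pages();	/* the only read of the tunable */
	unsigned long start = address & snap_mask(nr_pages);

	return start > vm_start ? start : vm_start;
}
```

Because snap_start_addr() reads the tunable exactly once, a later update can
only affect the next fault, not the one in progress.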

> So either the unlikely or the tunable should be removed.
> 
> The problem is that fault_around_bytes isn't documented so we don't even 
> know the min value without looking at the source code.

I would prefer to drop the tunable; it would make the code a bit simpler.
Andrew, iirc you asked for it. Do you still think we need the handle?

> I also don't see how nr_pages can be < 1.

As Andrey has pointed out, the 'if' is not needed.
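
A rough userspace sketch of why the check is redundant, under one assumption
that is not shown in this thread: the setter rounds the value down to a power
of two and never stores less than one page. With that, the page count read
later is always at least 1. The clamp_* / CLAMP_* names and the 4 KiB page
size are made up for the sketch.

```c
#include <assert.h>

/* Assumed values for the sketch: 4 KiB pages. */
#define CLAMP_PAGE_SHIFT 12
#define CLAMP_PAGE_SIZE  (1UL << CLAMP_PAGE_SHIFT)

/* Largest power of two that is <= v, for v >= 1. */
static unsigned long clamp_rounddown_pow_of_two(unsigned long v)
{
	unsigned long p = 1;

	while (p <= v / 2)
		p *= 2;
	return p;
}

/*
 * Assumed setter behaviour: values above one page are rounded down to a
 * power of two, anything else is stored as exactly one page.
 */
static unsigned long clamp_bytes(unsigned long val)
{
	if (val > CLAMP_PAGE_SIZE)
		return clamp_rounddown_pow_of_two(val);
	return CLAMP_PAGE_SIZE;
}

/* Page count a reader would see after the setter stored val. */
static unsigned long clamp_pages_after_set(unsigned long val)
{
	unsigned long nr_pages = clamp_bytes(val) >> CLAMP_PAGE_SHIFT;

	/* Holds for every input under the assumption above. */
	assert(nr_pages >= 1);
	return nr_pages;
}
```

With a page count of 1 the mask reduces to the plain page mask, so the window
is just the faulting page and no early return is required.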

-- 
 Kirill A. Shutemov
