Re: [resend][PATCH 4/4] oom: don't ignore rss in nascent mm

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On 11/25, Oleg Nesterov wrote:
> >
> > Great! I'll send the patch tomorrow.
> >
> > Even if you prefer another fix for 2.6.37/stable, I'd like to see
> > your review to know if it is correct or not (for backporting).
> 
> OK, what do you think about the patch below?

Great. Thanks a lot.


> 
> Seems to work, with this patch the test-case doesn't kill the
> system (sysctl_oom_kill_allocating_task == 0).
> 
> I didn't dare to change !CONFIG_MMU case, I do not know how to
> test it.
> 
> The patch is not complete, compat_copy_strings() needs changes.
> But, shouldn't it use get_arg_page() too? Otherwise, where do
> we check RLIMIT_STACK?
> 

Because NOMMU doesn't have variable length argv. Instead it is still
using MAX_ARG_STRLEN as old MMU code.

32 pages hard coded argv limitation naturally prevent this nascent mm
issue.


> The patch asks for the cleanups. In particular, I think exec_mmap()
> should accept bprm, not mm. But I'd prefer to do this later.
> 
> Oleg.

General request. Please consider to keep Brad's reported-by tag.


> 
>  include/linux/binfmts.h |    1 +
>  fs/exec.c               |   28 ++++++++++++++++++++++++++--
>  2 files changed, 27 insertions(+), 2 deletions(-)
> 
> --- K/include/linux/binfmts.h~acct_exec_mem	2010-08-19 11:35:00.000000000 +0200
> +++ K/include/linux/binfmts.h	2010-11-25 20:19:33.000000000 +0100
> @@ -33,6 +33,7 @@ struct linux_binprm{
>  # define MAX_ARG_PAGES	32
>  	struct page *page[MAX_ARG_PAGES];
>  #endif
> +	unsigned long vma_pages;
>  	struct mm_struct *mm;
>  	unsigned long p; /* current top of mem */
>  	unsigned int
> --- K/fs/exec.c~acct_exec_mem	2010-11-25 15:16:56.000000000 +0100
> +++ K/fs/exec.c	2010-11-25 20:20:49.000000000 +0100
> @@ -162,6 +162,25 @@ out:
>    	return error;
>  }
>  
> +static void acct_arg_size(struct linux_binprm *bprm, unsigned long pages)

Please move this function into #ifdef CONFIG_MMU. nommu code doesn't use it.

> +{
> +	struct mm_struct *mm = current->mm;
> +	long diff = pages - bprm->vma_pages;

I prefer to cast signed before assignment. It's safer more.


> +
> +	if (!mm || !diff)
> +		return;
> +
> +	bprm->vma_pages += diff;
> +
> +#ifdef SPLIT_RSS_COUNTING
> +	add_mm_counter(mm, MM_ANONPAGES, diff);
> +#else
> +	spin_lock(&mm->page_table_lock);
> +	add_mm_counter(mm, MM_ANONPAGES, diff);
> +	spin_unlock(&mm->page_table_lock);
> +#endif

OK, looks good.


> +}
> +
>  #ifdef CONFIG_MMU
>  
>  static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
> @@ -186,6 +205,8 @@ static struct page *get_arg_page(struct 
>  		unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
>  		struct rlimit *rlim;
>  
> +		acct_arg_size(bprm, size / PAGE_SIZE);
> +
>  		/*
>  		 * We've historically supported up to 32 pages (ARG_MAX)
>  		 * of argument strings even with small stacks
> @@ -1003,6 +1024,7 @@ int flush_old_exec(struct linux_binprm *
>  	/*
>  	 * Release all of the old mmap stuff
>  	 */
> +	acct_arg_size(bprm, 0);

Why do we need this unacct here? I mean 1) if exec_mmap() is success,
we don't need unaccount at all 2) if exec_mmap() is failure, an epilogue of
do_execve() does unaccount thing.


>  	retval = exec_mmap(bprm->mm);
>  	if (retval)
>  		goto out;
> @@ -1426,8 +1448,10 @@ int do_execve(const char * filename,
>  	return retval;
>  
>  out:
> -	if (bprm->mm)
> -		mmput (bprm->mm);
> +	if (bprm->mm) {
> +		acct_arg_size(bprm, 0);
> +		mmput(bprm->mm);
> +	}
>  
>  out_file:
>  	if (bprm->file) {
> 



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]