Re: [PATCH] binder_alloc: Move alloc_page() out of mmap_rwsem to reduce the lock duration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 03, 2024 at 10:50:09AM +1200, Barry Song wrote:
> From: Barry Song <v-songbaohua@xxxxxxxx>
> 
> The mmap_write_lock() can block all access to the VMAs, for example page
> faults. Performing memory allocation while holding this lock may trigger
> direct reclamation, leading to others being queued in the rwsem for an
> extended period.
> We've observed that the allocation can sometimes take more than 300ms,
> significantly blocking other threads. The user interface sometimes
> becomes less responsive as a result. To prevent this, let's move the
> allocation outside of the write lock.
> A potential side effect could be an extra alloc_page() for the second
> thread executing binder_install_single_page() while the first thread
> has done it earlier. However, according to Tangquan's 48-hour profiling
> using monkey, the likelihood of this occurring is minimal, with a ratio
> of only 1 in 2400. Compared to the significantly costly rwsem, this is
> negligible.
> On the other hand, holding a write lock without making any VMA
> modifications appears questionable and likely incorrect. While this
> patch focuses on reducing the lock duration, future updates may aim
> to eliminate the write lock entirely.
> 
> Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> Cc: "Arve Hjønnevåg" <arve@xxxxxxxxxxx>
> Cc: Todd Kjos <tkjos@xxxxxxxxxxx>
> Cc: Martijn Coenen <maco@xxxxxxxxxxx>
> Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> Cc: Christian Brauner <brauner@xxxxxxxxxx>
> Cc: Carlos Llamas <cmllamas@xxxxxxxxxx>
> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Tested-by: Tangquan Zheng <zhengtangquan@xxxxxxxx>
> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> ---
>  drivers/android/binder_alloc.c | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index b3acbc4174fb..f20074e23a7c 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -227,13 +227,23 @@ static int binder_install_single_page(struct binder_alloc *alloc,
>  	if (!mmget_not_zero(alloc->mm))
>  		return -ESRCH;
>  
> +	/*
> +	 * Don't allocate page in mmap_write_lock, this can block
> +	 * mmap_rwsem for a long time; Meanwhile, allocation failure
> +	 * doesn't necessarily need to return -ENOMEM, if lru_page
> +	 * has been installed, we can still return 0(success).
> +	 */
> +	page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);

But now you are allocating new pages even if binder_get_installed_page()
is an error, right?  Doesn't that slow things down?

How was this benchmarked?

> +
>  	/*
>  	 * Protected with mmap_sem in write mode as multiple tasks
>  	 * might race to install the same page.
>  	 */
>  	mmap_write_lock(alloc->mm);
> -	if (binder_get_installed_page(lru_page))
> +	if (binder_get_installed_page(lru_page)) {
> +		ret = 1;

That is not a valid error value :(

>  		goto out;
> +	}
>  
>  	if (!alloc->vma) {
>  		pr_err("%d: %s failed, no vma\n", alloc->pid, __func__);
> @@ -241,7 +251,6 @@ static int binder_install_single_page(struct binder_alloc *alloc,
>  		goto out;
>  	}
>  
> -	page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
>  	if (!page) {
>  		pr_err("%d: failed to allocate page\n", alloc->pid);
>  		ret = -ENOMEM;
> @@ -252,7 +261,6 @@ static int binder_install_single_page(struct binder_alloc *alloc,
>  	if (ret) {
>  		pr_err("%d: %s failed to insert page at offset %lx with %d\n",
>  		       alloc->pid, __func__, addr - alloc->buffer, ret);
> -		__free_page(page);
>  		ret = -ENOMEM;
>  		goto out;
>  	}
> @@ -262,7 +270,9 @@ static int binder_install_single_page(struct binder_alloc *alloc,
>  out:
>  	mmap_write_unlock(alloc->mm);
>  	mmput_async(alloc->mm);
> -	return ret;
> +	if (ret && page)
> +		__free_page(page);
> +	return ret < 0 ? ret : 0;

Please only use ? : for when you have to, otherwise please spell it out
with a normal if statement:
	if (ret < 0)
		return ret;
	return 0;

But, this is abusing the fact that you set "ret = 1" above, which is
going to trip someone up in the future as that is NOT a normal coding
pattern we have in the kernel, sorry.

If you insist on this change, please rework it to not have that type of
"positive means one thing, 0 means another, and negative means yet
something else" please.

thanks,

greg k-h




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux