Re: [PATCH 3.16, 4.4, 4.9] dm bufio: avoid sleeping while holding the dm_bufio lock

Ben Hutchings <ben@xxxxxxxxxxxxxxx> · Sat, 10 Nov 2018 23:14:44 +0000

On Wed, 2018-07-04 at 10:36 -0400, Mikulas Patocka wrote:
> This is backport of the upstream patch 
> 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc. It should be backported to 
> stable kernels because this performance problem was seen on the android 
> 4.4 kernel.

I've finally queued this up for 3.16, thanks.

Ben.

> commit 9ea61cac0b1ad0c09022f39fd97e9b99a2cfc2dc
> Author: Douglas Anderson <dianders@xxxxxxxxxxxx>
> Date:   Thu Nov 17 11:24:20 2016 -0800
> 
>     dm bufio: avoid sleeping while holding the dm_bufio lock
>     
>     We've seen in-field reports showing _lots_ (18 in one case, 41 in
>     another) of tasks all sitting there blocked on:
>     
>       mutex_lock+0x4c/0x68
>       dm_bufio_shrink_count+0x38/0x78
>       shrink_slab.part.54.constprop.65+0x100/0x464
>       shrink_zone+0xa8/0x198
>     
>     In the two cases analyzed, we see one task that looks like this:
>     
>       Workqueue: kverityd verity_prefetch_io
>     
>       __switch_to+0x9c/0xa8
>       __schedule+0x440/0x6d8
>       schedule+0x94/0xb4
>       schedule_timeout+0x204/0x27c
>       schedule_timeout_uninterruptible+0x44/0x50
>       wait_iff_congested+0x9c/0x1f0
>       shrink_inactive_list+0x3a0/0x4cc
>       shrink_lruvec+0x418/0x5cc
>       shrink_zone+0x88/0x198
>       try_to_free_pages+0x51c/0x588
>       __alloc_pages_nodemask+0x648/0xa88
>       __get_free_pages+0x34/0x7c
>       alloc_buffer+0xa4/0x144
>       __bufio_new+0x84/0x278
>       dm_bufio_prefetch+0x9c/0x154
>       verity_prefetch_io+0xe8/0x10c
>       process_one_work+0x240/0x424
>       worker_thread+0x2fc/0x424
>       kthread+0x10c/0x114
>     
>     ...and that looks to be the one holding the mutex.
>     
>     The problem has been reproduced on fairly easily:
>     0. Be running Chrome OS w/ verity enabled on the root filesystem
>     1. Pick test patch: http://crosreview.com/412360
>     2. Install launchBalloons.sh and balloon.arm from
>          http://crbug.com/468342
>        ...that's just a memory stress test app.
>     3. On a 4GB rk3399 machine, run
>          nice ./launchBalloons.sh 4 900 100000
>        ...that tries to eat 4 * 900 MB of memory and keep accessing.
>     4. Login to the Chrome web browser and restore many tabs
>     
>     With that, I've seen printouts like:
>       DOUG: long bufio 90758 ms
>     ...and stack trace always show's we're in dm_bufio_prefetch().
>     
>     The problem is that we try to allocate memory with GFP_NOIO while
>     we're holding the dm_bufio lock.  Instead we should be using
>     GFP_NOWAIT.  Using GFP_NOIO can cause us to sleep while holding the
>     lock and that causes the above problems.
>     
>     The current behavior explained by David Rientjes:
>     
>       It will still try reclaim initially because __GFP_WAIT (or
>       __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO.  This is the cause of
>       contention on dm_bufio_lock() that the thread holds.  You want to
>       pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
>       mutex that can be contended by a concurrent slab shrinker (if
>       count_objects didn't use a trylock, this pattern would trivially
>       deadlock).
>     
>     This change significantly increases responsiveness of the system while
>     in this state.  It makes a real difference because it unblocks kswapd.
>     In the bug report analyzed, kswapd was hung:
>     
>        kswapd0         D ffffffc000204fd8     0    72      2 0x00000000
>        Call trace:
>        [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
>        [<ffffffc00090b794>] __schedule+0x440/0x6d8
>        [<ffffffc00090bac0>] schedule+0x94/0xb4
>        [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
>        [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
>        [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
>        [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
>        [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
>        [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
>        [<ffffffc00030e578>] balance_pgdat+0x328/0x508
>        [<ffffffc00030eb7c>] kswapd+0x424/0x51c
>        [<ffffffc00023f06c>] kthread+0x10c/0x114
>        [<ffffffc000203dd0>] ret_from_fork+0x10/0x40
>     
>     By unblocking kswapd memory pressure should be reduced.
>     
>     Suggested-by: David Rientjes <rientjes@xxxxxxxxxx>
>     Reviewed-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>     Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
>     Signed-off-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> 
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index 125aedc3875f..b5d3428d109e 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -827,7 +827,8 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
>          * dm-bufio is resistant to allocation failures (it just keeps
>          * one buffer reserved in cases all the allocations fail).
>          * So set flags to not try too hard:
> -        *      GFP_NOIO: don't recurse into the I/O layer
> +        *      GFP_NOWAIT: don't wait; if we need to sleep we'll release our
> +        *                  mutex and wait ourselves.
>          *      __GFP_NORETRY: don't retry and rather return failure
>          *      __GFP_NOMEMALLOC: don't use emergency reserves
>          *      __GFP_NOWARN: don't print a warning in case of failure
> @@ -837,7 +838,7 @@ static struct dm_buffer *__alloc_buffer_wait_no_callback(struct dm_bufio_client
>          */
>         while (1) {
>                 if (dm_bufio_cache_size_latch != 1) {
> -                       b = alloc_buffer(c, GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
> +                       b = alloc_buffer(c, GFP_NOWAIT | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
>                         if (b)
>                                 return b;
>                 }
-- 
Ben Hutchings
Kids!  Bringing about Armageddon can be dangerous.  Do not attempt it
in your own home. - Terry Pratchett and Neil Gaiman, `Good Omens'

Attachment:
signature.asc

Description: This is a digitally signed message part