+ mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Subject: + mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting.patch added to -mm tree
To: hannes@xxxxxxxxxxx,guz.fnst@xxxxxxxxxxxxxx,isimatu.yasuaki@xxxxxxxxxxxxxx,stable@xxxxxxxxxxxxxxx,tangchen@xxxxxxxxxxxxxx
From: akpm@xxxxxxxxxxxxxxxxxxxx
Date: Thu, 05 Jun 2014 14:39:14 -0700


The patch titled
     Subject: mm: vmscan: clear kswapd's special reclaim powers before exiting
has been added to the -mm tree.  Its filename is
     mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Johannes Weiner <hannes@xxxxxxxxxxx>
Subject: mm: vmscan: clear kswapd's special reclaim powers before exiting

When kswapd exits, it can end up taking locks that were previously held by
allocating tasks while they waited for reclaim.  Lockdep currently warns
about this:

On Wed, May 28, 2014 at 06:06:34PM +0800, Gu Zheng wrote:
> [ 2457.683370] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
> [ 2457.761540] kswapd2/1151 [HC0[0]:SC0[0]:HE1:SE1] takes:
> [ 2457.824102]  (&sig->group_rwsem){+++++?}, at: [<ffffffff81071864>] exit_signals+0x24/0x130
> [ 2457.923538] {RECLAIM_FS-ON-W} state was registered at:
> [ 2457.985055]   [<ffffffff810bfc99>] mark_held_locks+0xb9/0x140
> [ 2458.053976]   [<ffffffff810c1e3a>] lockdep_trace_alloc+0x7a/0xe0
> [ 2458.126015]   [<ffffffff81194f47>] kmem_cache_alloc_trace+0x37/0x240
> [ 2458.202214]   [<ffffffff812c6e89>] flex_array_alloc+0x99/0x1a0
> [ 2458.272175]   [<ffffffff810da563>] cgroup_attach_task+0x63/0x430
> [ 2458.344214]   [<ffffffff810dcca0>] attach_task_by_pid+0x210/0x280
> [ 2458.417294]   [<ffffffff810dcd26>] cgroup_procs_write+0x16/0x20
> [ 2458.488287]   [<ffffffff810d8410>] cgroup_file_write+0x120/0x2c0
> [ 2458.560320]   [<ffffffff811b21a0>] vfs_write+0xc0/0x1f0
> [ 2458.622994]   [<ffffffff811b2bac>] SyS_write+0x4c/0xa0
> [ 2458.684618]   [<ffffffff815ec3c0>] tracesys+0xdd/0xe2
> [ 2458.745214] irq event stamp: 49
> [ 2458.782794] hardirqs last  enabled at (49): [<ffffffff815e2b56>] _raw_spin_unlock_irqrestore+0x36/0x70
> [ 2458.894388] hardirqs last disabled at (48): [<ffffffff815e337b>] _raw_spin_lock_irqsave+0x2b/0xa0
> [ 2459.000771] softirqs last  enabled at (0): [<ffffffff81059247>] copy_process.part.24+0x627/0x15f0
> [ 2459.107161] softirqs last disabled at (0): [<          (null)>]           (null)
> [ 2459.195852]
> [ 2459.195852] other info that might help us debug this:
> [ 2459.274024]  Possible unsafe locking scenario:
> [ 2459.274024]
> [ 2459.344911]        CPU0
> [ 2459.374161]        ----
> [ 2459.403408]   lock(&sig->group_rwsem);
> [ 2459.448490]   <Interrupt>
> [ 2459.479825]     lock(&sig->group_rwsem);
> [ 2459.526979]
> [ 2459.526979]  *** DEADLOCK ***
> [ 2459.526979]
> [ 2459.597866] no locks held by kswapd2/1151.
> [ 2459.646896]
> [ 2459.646896] stack backtrace:
> [ 2459.699049] CPU: 30 PID: 1151 Comm: kswapd2 Not tainted 3.10.39+ #4
> [ 2459.774098] Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 01.48 05/07/2014
> [ 2459.895983]  ffffffff82284bf0 ffff88085856bbf8 ffffffff815dbcf6 ffff88085856bc48
> [ 2459.985003]  ffffffff815d67c6 0000000000000000 ffff880800000001 ffff880800000001
> [ 2460.074024]  000000000000000a ffff88085edc9600 ffffffff810be0e0 0000000000000009
> [ 2460.163087] Call Trace:
> [ 2460.192345]  [<ffffffff815dbcf6>] dump_stack+0x19/0x1b
> [ 2460.253874]  [<ffffffff815d67c6>] print_usage_bug+0x1f7/0x208
> [ 2460.399807]  [<ffffffff810bfb5d>] mark_lock+0x21d/0x2a0
> [ 2460.462369]  [<ffffffff810c076a>] __lock_acquire+0x52a/0xb60
> [ 2460.735516]  [<ffffffff810c1592>] lock_acquire+0xa2/0x140
> [ 2460.935691]  [<ffffffff815e01e1>] down_read+0x51/0xa0
> [ 2461.062888]  [<ffffffff81071864>] exit_signals+0x24/0x130
> [ 2461.127536]  [<ffffffff81060d55>] do_exit+0xb5/0xa50
> [ 2461.320433]  [<ffffffff8108303b>] kthread+0xdb/0x100
> [ 2461.532049]  [<ffffffff815ec0ec>] ret_from_fork+0x7c/0xb0

This is because the kswapd thread is still marked as a reclaimer at the
time of exit.  But because it is exiting, nobody is actually waiting on it
to make reclaim progress anymore, and it's nothing but a regular thread at
this point.  Be tidy and strip it of all its powers (PF_MEMALLOC,
PF_SWAPWRITE, PF_KSWAPD, and the lockdep reclaim state) before returning
from the thread function.

Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
Reported-by: Gu Zheng <guz.fnst@xxxxxxxxxxxxxx>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@xxxxxxxxxxxxxx>
Cc: Tang Chen <tangchen@xxxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/vmscan.c |    3 +++
 1 file changed, 3 insertions(+)

diff -puN mm/vmscan.c~mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting mm/vmscan.c
--- a/mm/vmscan.c~mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting
+++ a/mm/vmscan.c
@@ -3419,7 +3419,10 @@ static int kswapd(void *p)
 		}
 	}
 
+	tsk->flags &= ~(PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD);
 	current->reclaim_state = NULL;
+	lockdep_clear_current_reclaim_state();
+
 	return 0;
 }
 
_

Patches currently in -mm which might be from hannes@xxxxxxxxxxx are

origin.patch
mm-vmscan-clear-kswapds-special-reclaim-powers-before-exiting.patch
pagewalk-update-page-table-walker-core.patch
pagewalk-add-walk_page_vma.patch
smaps-redefine-callback-functions-for-page-table-walker.patch
clear_refs-redefine-callback-functions-for-page-table-walker.patch
pagemap-redefine-callback-functions-for-page-table-walker.patch
numa_maps-redefine-callback-functions-for-page-table-walker.patch
memcg-redefine-callback-functions-for-page-table-walker.patch
arch-powerpc-mm-subpage-protc-use-walk_page_vma-instead-of-walk_page_range.patch
pagewalk-remove-argument-hmask-from-hugetlb_entry.patch
mempolicy-apply-page-table-walker-on-queue_pages_range.patch
linux-next.patch
memcg-mm-introduce-lowlimit-reclaim.patch
memcg-mm-introduce-lowlimit-reclaim-fix.patch
memcg-mm-introduce-lowlimit-reclaim-fix2patch.patch
memcg-allow-setting-low_limit.patch
memcg-doc-clarify-global-vs-limit-reclaims.patch
memcg-doc-clarify-global-vs-limit-reclaims-fix.patch
memcg-document-memorylow_limit_in_bytes.patch
vmscan-memcg-check-whether-the-low-limit-should-be-ignored.patch
vmscan-memcg-always-use-swappiness-of-the-reclaimed-memcg-swappiness-and-oom_control.patch
vmscan-memcg-always-use-swappiness-of-the-reclaimed-memcg-swappiness-and-o-om-control-fixpatch.patch
mm-introduce-kmemleak_update_trace.patch
lib-update-the-kmemleak-stack-trace-for-radix-tree-allocations.patch
mm-memcontrol-clean-up-memcg-zoneinfo-lookup.patch
mm-memcontrol-remove-unnecessary-memcg-argument-from-soft-limit-functions.patch
memcg-deprecate-memoryforce_empty-knob.patch
memcg-deprecate-memoryforce_empty-knob-fix.patch
debugging-keep-track-of-page-owners.patch

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]