+ mm-implement-swap-prefetching-fix.patch added to -mm tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The patch titled

     swap prefetch fix

has been added to the -mm tree.  Its filename is

     mm-implement-swap-prefetching-fix.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this


From: Ingo Molnar <mingo@xxxxxxx>

fix potential swap-prefetch deadlock: swapped.lock must be taken in an
irq-safe manner, because it can be taken while holding an irq-safe lock in
the following code sequence:

 [<c014646e>] lockdep_acquire+0x69/0x82
 [<c10c4fc1>] _spin_lock+0x21/0x2f
 [<c0175e7d>] add_to_swapped_list+0x1f/0x13b
 [<c016780d>] remove_mapping+0x84/0xc3
 [<c0167c47>] shrink_inactive_list+0x3fb/0x705
 [<c016800a>] shrink_zone+0xb9/0xd8
 [<c01682bc>] kswapd+0x293/0x38a
 [<c013f271>] kthread+0xa6/0xd3

found by the locking correctness validator. The full validator
output is:

======================================================
[ BUG: hard-safe -> hard-unsafe lock order detected! ]
------------------------------------------------------
kswapd0/1173 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 (swapped.lock){--..}, at: [<c0175e7d>] add_to_swapped_list+0x1f/0x13b

and this task is already holding:
 (swapper_space.tree_lock){+...}, at: [<c01677a5>] remove_mapping+0x1c/0xc3
which would create a new lock dependency:
 (swapper_space.tree_lock){+...} -> (swapped.lock){--..}

but this new dependency connects a hard-irq-safe lock:
 (swapper_space.tree_lock){+...}
... which became hard-irq-safe at:
  [<c014646e>] lockdep_acquire+0x69/0x82
  [<c10c52a2>] _write_lock_irqsave+0x2a/0x3a
  [<c0164c6f>] test_clear_page_writeback+0x48/0xbc
  [<c0165a57>] rotate_reclaimable_page+0x8b/0xb4
  [<c015e2c3>] end_page_writeback+0x18/0x45
  [<c0173153>] end_swap_bio_write+0x28/0x36
  [<c018549a>] bio_endio+0x66/0x6e
  [<c0532613>] __end_that_request_first+0x23c/0x559
  [<c0532945>] end_that_request_first+0xb/0xd
  [<c09cae20>] ide_end_request+0xa8/0xf1
  [<c09d1c44>] ide_dma_intr+0x54/0x8f
  [<c09cc011>] ide_intr+0x147/0x1a7
  [<c015adc7>] handle_IRQ_event+0x1f/0x4f
  [<c015ae9d>] __do_IRQ+0xa6/0x111
  [<c0106287>] do_IRQ+0x8c/0xad

to a hard-irq-unsafe lock:
 (swapped.lock){--..}
... which became hard-irq-unsafe at:
...  [<c014646e>] lockdep_acquire+0x69/0x82
  [<c10c4fc1>] _spin_lock+0x21/0x2f
  [<c01763fc>] kprefetchd+0x3eb/0x40b
  [<c013f271>] kthread+0xa6/0xd3
  [<c0102005>] kernel_thread_helper+0x5/0xb

which could potentially lead to deadlocks!

other info that might help us debug this:

1 locks held by kswapd0/1173:
 #0:  (swapper_space.tree_lock){+...}, at: [<c01677a5>] remove_mapping+0x1c/0xc3

the hard-irq-safe lock's dependencies:
-> (swapper_space.tree_lock){+...} ops: 121 {
   initial-use  at:
                        [<c014646e>] lockdep_acquire+0x69/0x82
                        [<c10c505e>] _write_lock_irq+0x27/0x35
                        [<c017341a>] __add_to_swap_cache+0x65/0x149
                        [<c0173512>] move_to_swap_cache+0x14/0x62
                        [<c0178af8>] shmem_writepage+0x10c/0x1b5
                        [<c01672e9>] pageout+0x119/0x1cc
                        [<c0167b73>] shrink_inactive_list+0x327/0x705
                        [<c016800a>] shrink_zone+0xb9/0xd8
                        [<c01684f5>] try_to_free_pages+0x142/0x221
                        [<c0163e36>] __alloc_pages+0x1d9/0x325
                        [<c015f919>] generic_file_buffered_write+0x189/0x5cd
                        [<c0160f8d>] __generic_file_aio_write_nolock+0x450/0x48d
                        [<c01611fa>] generic_file_aio_write+0x6e/0xc1
                        [<c02ab0c2>] ext3_file_write+0x1a/0x8b
                        [<c017fe22>] do_sync_write+0xb1/0xe6
                        [<c01807da>] vfs_write+0xcd/0x176
                        [<c0180e7d>] sys_write+0x3b/0x71
                        [<c10c5abb>] syscall_call+0x7/0xb
   in-hardirq-W at:
                        [<c014646e>] lockdep_acquire+0x69/0x82
                        [<c10c52a2>] _write_lock_irqsave+0x2a/0x3a
                        [<c0164c6f>] test_clear_page_writeback+0x48/0xbc
                        [<c0165a57>] rotate_reclaimable_page+0x8b/0xb4
                        [<c015e2c3>] end_page_writeback+0x18/0x45
                        [<c0173153>] end_swap_bio_write+0x28/0x36
                        [<c018549a>] bio_endio+0x66/0x6e
                        [<c0532613>] __end_that_request_first+0x23c/0x559
                        [<c0532945>] end_that_request_first+0xb/0xd
                        [<c09cae20>] ide_end_request+0xa8/0xf1
                        [<c09d1c44>] ide_dma_intr+0x54/0x8f
                        [<c09cc011>] ide_intr+0x147/0x1a7
                        [<c015adc7>] handle_IRQ_event+0x1f/0x4f
                        [<c015ae9d>] __do_IRQ+0xa6/0x111
                        [<c0106287>] do_IRQ+0x8c/0xad
 }
 ... key      at: [<c14d4e64>] swapper_space+0x24/0xfc

the hard-irq-unsafe lock's dependencies:
-> (swapped.lock){--..} ops: 2 {
   initial-use  at:
                        [<c014646e>] lockdep_acquire+0x69/0x82
                        [<c10c4fc1>] _spin_lock+0x21/0x2f
                        [<c01763fc>] kprefetchd+0x3eb/0x40b
                        [<c013f271>] kthread+0xa6/0xd3
                        [<c0102005>] kernel_thread_helper+0x5/0xb
   softirq-on-W at:
                        [<c014646e>] lockdep_acquire+0x69/0x82
                        [<c10c4fc1>] _spin_lock+0x21/0x2f
                        [<c01763fc>] kprefetchd+0x3eb/0x40b
                        [<c013f271>] kthread+0xa6/0xd3
                        [<c0102005>] kernel_thread_helper+0x5/0xb
   hardirq-on-W at:
                        [<c014646e>] lockdep_acquire+0x69/0x82
                        [<c10c4fc1>] _spin_lock+0x21/0x2f
                        [<c01763fc>] kprefetchd+0x3eb/0x40b
                        [<c013f271>] kthread+0xa6/0xd3
                        [<c0102005>] kernel_thread_helper+0x5/0xb
 }
 ... key      at: [<c14d5658>] swapped+0x18/0x60

stack backtrace:
 [<c0104e7f>] show_trace+0xd/0xf
 [<c0104e96>] dump_stack+0x15/0x17
 [<c0144cd3>] check_usage+0x1f4/0x201
 [<c0146238>] __lockdep_acquire+0x873/0xa40
 [<c014646e>] lockdep_acquire+0x69/0x82
 [<c10c4fc1>] _spin_lock+0x21/0x2f
 [<c0175e7d>] add_to_swapped_list+0x1f/0x13b
 [<c016780d>] remove_mapping+0x84/0xc3
 [<c0167c47>] shrink_inactive_list+0x3fb/0x705
 [<c016800a>] shrink_zone+0xb9/0xd8
 [<c01682bc>] kswapd+0x293/0x38a
 [<c013f271>] kthread+0xa6/0xd3
 [<c0102005>] kernel_thread_helper+0x5/0xb

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
Cc: Con Kolivas <kernel@xxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 mm/swap_prefetch.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff -puN mm/swap_prefetch.c~mm-implement-swap-prefetching-fix mm/swap_prefetch.c
--- devel/mm/swap_prefetch.c~mm-implement-swap-prefetching-fix	2006-05-09 08:42:41.000000000 -0700
+++ devel-akpm/mm/swap_prefetch.c	2006-05-09 08:42:41.000000000 -0700
@@ -65,7 +65,7 @@ inline void delay_swap_prefetch(void)
 void add_to_swapped_list(struct page *page)
 {
 	struct swapped_entry *entry;
-	unsigned long index;
+	unsigned long index, flags;
 	int wakeup;
 
 	if (!swap_prefetch)
@@ -73,7 +73,7 @@ void add_to_swapped_list(struct page *pa
 
 	wakeup = 0;
 
-	spin_lock(&swapped.lock);
+	spin_lock_irqsave(&swapped.lock, flags);
 	if (swapped.count >= swapped.maxcount) {
 		/*
 		 * We limit the number of entries to 2/3 of physical ram.
@@ -112,7 +112,7 @@ void add_to_swapped_list(struct page *pa
 	}
 
 out_locked:
-	spin_unlock(&swapped.lock);
+	spin_unlock_irqrestore(&swapped.lock, flags);
 
 	/* Do the wakeup outside the lock to shorten lock hold time. */
 	if (wakeup)
@@ -433,6 +433,7 @@ static enum trickle_return trickle_swap(
 {
 	enum trickle_return ret = TRICKLE_DELAY;
 	struct swapped_entry *entry;
+	unsigned long flags;
 
 	/*
 	 * If laptop_mode is enabled don't prefetch to avoid hard drives
@@ -451,10 +452,10 @@ static enum trickle_return trickle_swap(
 		if (!prefetch_suitable())
 			break;
 
-		spin_lock(&swapped.lock);
+		spin_lock_irqsave(&swapped.lock, flags);
 		if (list_empty(&swapped.list)) {
 			ret = TRICKLE_FAILED;
-			spin_unlock(&swapped.lock);
+			spin_unlock_irqrestore(&swapped.lock, flags);
 			break;
 		}
 
@@ -473,7 +474,7 @@ static enum trickle_return trickle_swap(
 			 * be a reason we could not swap them back in so
 			 * delay attempting further prefetching.
 			 */
-			spin_unlock(&swapped.lock);
+			spin_unlock_irqrestore(&swapped.lock, flags);
 			break;
 		}
 
@@ -484,12 +485,12 @@ static enum trickle_return trickle_swap(
 			 * not suitable for prefetching so skip it.
 			 */
 			entry = prev_swapped_entry(entry);
-			spin_unlock(&swapped.lock);
+			spin_unlock_irqrestore(&swapped.lock, flags);
 			continue;
 		}
 		swp_entry = entry->swp_entry;
 		entry = prev_swapped_entry(entry);
-		spin_unlock(&swapped.lock);
+		spin_unlock_irqrestore(&swapped.lock, flags);
 
 		if (trickle_swap_cache_async(swp_entry, node) == TRICKLE_DELAY)
 			break;
_

Patches currently in -mm which might be from mingo@xxxxxxx are

origin.patch
sem2mutex-drivers-acpi.patch
sem2mutex-acpi-acpi_link_lock.patch
git-dvb.patch
sem2mutex-drivers-ieee1394.patch
git-netdev-all.patch
git-serial.patch
fix-for-serial-uart-lockup.patch
qla2xxx-lock-ordering-fix.patch
x86_64-mm-serialize-assign_irq_vector-use-of-static-variables-fix.patch
swapless-pm-add-r-w-migration-entries-fix.patch
i386-break-out-of-recursion-in-stackframe-walk.patch
work-around-ppc64-bootup-bug-by-making-mutex-debugging-save-restore-irqs.patch
kernel-kernel-cpuc-to-mutexes.patch
cond-resched-might-sleep-fix.patch
time-clocksource-infrastructure.patch
sched-implement-smpnice.patch
sched-protect-calculation-of-max_pull-from-integer-wrap.patch
sched-store-weighted-load-on-up.patch
sched-add-discrete-weighted-cpu-load-function.patch
sched-prevent-high-load-weight-tasks-suppressing-balancing.patch
sched-improve-stability-of-smpnice-load-balancing.patch
sched-improve-smpnice-load-balancing-when-load-per-task.patch
smpnice-dont-consider-sched-groups-which-are-lightly-loaded-for-balancing.patch
smpnice-dont-consider-sched-groups-which-are-lightly-loaded-for-balancing-fix.patch
sched-modify-move_tasks-to-improve-load-balancing-outcomes.patch
sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks.patch
sched-avoid-unnecessarily-moving-highest-priority-task-move_tasks-fix-2.patch
sched_domain-handle-kmalloc-failure.patch
sched_domain-handle-kmalloc-failure-fix.patch
sched_domain-dont-use-gfp_atomic.patch
sched_domain-use-kmalloc_node.patch
sched_domain-allocate-sched_group-structures-dynamically.patch
sched-add-above-background-load-function.patch
mm-implement-swap-prefetching-fix.patch
pi-futex-futex-code-cleanups.patch
pi-futex-futex-code-cleanups-fix.patch
pi-futex-introduce-debug_check_no_locks_freed.patch
pi-futex-add-plist-implementation.patch
pi-futex-scheduler-support-for-pi.patch
pi-futex-rt-mutex-core.patch
pi-futex-rt-mutex-core-fix-timeout-race.patch
pi-futex-rt-mutex-docs.patch
pi-futex-rt-mutex-debug.patch
pi-futex-rt-mutex-tester.patch
pi-futex-rt-mutex-futex-api.patch
pi-futex-futex_lock_pi-futex_unlock_pi-support.patch
pi-futex-v2.patch
pi-futex-v3.patch
pi-futex-patchset-v4.patch
pi-futex-patchset-v4-update.patch
rtmutex-remove-buggy-bug_on-in-pi-boosting-code.patch
futex-pi-enforce-waiter-bit-when-owner-died-is-detected.patch
rtmutex-debug-printk-correct-task-information.patch
futex-pi-make-use-of-restart_block-when-interrupted.patch
reiser4.patch
detect-atomic-counter-underflows.patch
debug-shared-irqs.patch
make-frame_pointer-default=y.patch
mutex-subsystem-synchro-test-module.patch

-
To unsubscribe from this list: send the line "unsubscribe mm-commits" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Kernel Newbies FAQ]     [Kernel Archive]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [Bugtraq]     [Photo]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]

  Powered by Linux