mm: swap: hang in lru_add_drain_all

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest -next
kernel I've stumbled on the following spew:

[  729.682257] INFO: task trinity-c158:13508 blocked for more than 120 seconds.
[  729.683191]       Not tainted 3.16.0-rc5-next-20140718-sasha-00052-g4d34feb #902
[  729.683843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  729.684633] trinity-c158    D ffff880d32169000 12520 13508   8729 0x00000000
[  729.685323]  ffff880de6d3fe38 0000000000000002 0000000000000006 ffff880c6c33b000
[  729.686039]  000000785eb36cd1 ffff880de6d3c010 ffff880de6d3c000 ffff880e2b270000
[  729.686761]  ffff880c6c33b000 0000000000000000 ffffffffb070b908 0000000026e426e2
[  729.687483] Call Trace:
[  729.687736] schedule_preempt_disabled (kernel/sched/core.c:2874)
[  729.688544] mutex_lock_nested (kernel/locking/mutex.c:532 kernel/locking/mutex.c:584)
[  729.689127] ? lru_add_drain_all (mm/swap.c:867)
[  729.689666] ? lru_add_drain_all (mm/swap.c:867)
[  729.690334] lru_add_drain_all (mm/swap.c:867)
[  729.690946] SyS_mlockall (./arch/x86/include/asm/current.h:14 include/linux/sched.h:2978 mm/mlock.c:813 mm/mlock.c:798)
[  729.691437] tracesys (arch/x86/kernel/entry_64.S:541)
[  729.691883] 1 lock held by trinity-c158/13508:
[  729.692333] #0: (lock#3){+.+...}, at: lru_add_drain_all (mm/swap.c:867)

There are quite a few tasks with the same stacktrace. Since the lock we're blocking
on is static local to the function it's easy to figure out which task actually has it:

[  739.587839] trinity-c135    D ffff880e1318c000 13096 21051   8729 0x00000000
[  739.589080]  ffff880bf659fcc0 0000000000000002 ffff880c74223cf0 ffff880c74223000
[  739.590544]  000000781c3378ed ffff880bf659c010 ffff880bf659c000 ffff880e2b153000
[  739.591815]  ffff880c74223000 0000000000000000 7fffffffffffffff ffff880bf659fe80
[  739.593165] Call Trace:
[  739.593588] schedule (kernel/sched/core.c:2847)
[  739.594396] schedule_timeout (kernel/time/timer.c:1476)
[  739.595354] ? mark_lock (kernel/locking/lockdep.c:2894)
[  739.596229] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[  739.597297] ? get_parent_ip (kernel/sched/core.c:2561)
[  739.598185] wait_for_completion (include/linux/spinlock.h:328 kernel/sched/completion.c:76 kernel/sched/completion.c:93 kernel/sched/completion.c:101 kernel/sched/completion.c:122)
[  739.599191] ? wake_up_state (kernel/sched/core.c:2942)
[  739.600246] flush_work (kernel/workqueue.c:503 kernel/workqueue.c:2762)
[  739.601171] ? flush_work (kernel/workqueue.c:2733 kernel/workqueue.c:2760)
[  739.602084] ? destroy_worker (kernel/workqueue.c:2348)
[  739.603035] ? wait_for_completion (kernel/sched/completion.c:64 kernel/sched/completion.c:93 kernel/sched/completion.c:101 kernel/sched/completion.c:122)
[  739.604043] lru_add_drain_all (include/linux/cpumask.h:180 mm/swap.c:883)
[  739.605076] SyS_mlockall (./arch/x86/include/asm/current.h:14 include/linux/sched.h:2978 mm/mlock.c:813 mm/mlock.c:798)
[  739.605943] tracesys (arch/x86/kernel/entry_64.S:541)

Now, you'd expect to see lru_add_drain_per_cpu in one of the tasks, but
that's not the case.

Full log attched.


Thanks,
Sasha

Attachment: log.txt.xz
Description: application/xz


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]