inconsistent lock state on v4.14.20-rt17

Hi,

Ever since 4.9 we've been chasing random kernel crashes that are
reproducible on RT with SMP on an i.MX6Q. They occur when the system
is stressed with hackbench, but only when hackbench uses sockets, not
when it uses pipes.

We recently upgraded to v4.14.20-rt17, which doesn't solve the issue;
instead, the kernel now locks up. After enabling some lock debugging
options we were able to catch a trace (see below). It would be great
if someone could have a look at it, or guide me in tracking down the
root cause.

Thanks,
Henri

[18586.277233] ================================
[18586.277236] WARNING: inconsistent lock state
[18586.277245] 4.14.20-rt17-henri-1 #15 Tainted: G        W
[18586.277248] --------------------------------
[18586.277253] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[18586.277263] hackbench/18985 [HC0[0]:SC0[0]:HE1:SE1] takes:
[18586.277267]  (&rq->lock){?...}, at: [<c0992134>]
__schedule+0x128/0x6ac
[18586.277300] {IN-HARDIRQ-W} state was registered at:
[18586.277314]   lock_acquire+0x288/0x32c
[18586.277324]   _raw_spin_lock+0x48/0x58
[18586.277338]   scheduler_tick+0x40/0xb4
[18586.277349]   update_process_times+0x38/0x6c
[18586.277359]   tick_periodic+0x120/0x148
[18586.277366]   tick_handle_periodic+0x2c/0xa0
[18586.277378]   twd_handler+0x3c/0x48
[18586.277389]   handle_percpu_devid_irq+0x290/0x608
[18586.277395]   generic_handle_irq+0x28/0x38
[18586.277402]   __handle_domain_irq+0xd4/0xf0
[18586.277409]   gic_handle_irq+0x64/0xa8
[18586.277414]   __irq_svc+0x70/0xc4
[18586.277420]   lock_acquire+0x2a4/0x32c
[18586.277425]   lock_acquire+0x2a4/0x32c
[18586.277440]   down_write_nested+0x54/0x68
[18586.277453]   sget_userns+0x310/0x4f4
[18586.277465]   mount_pseudo_xattr+0x68/0x170
[18586.277477]   nsfs_mount+0x3c/0x50
[18586.277484]   mount_fs+0x24/0xa8
[18586.277490]   vfs_kern_mount+0x58/0x118
[18586.277496]   kern_mount_data+0x24/0x34
[18586.277507]   nsfs_init+0x20/0x58
[18586.277522]   start_kernel+0x2f8/0x360
[18586.277528]   0x1000807c
[18586.277532] irq event stamp: 19441
[18586.277542] hardirqs last  enabled at (19441): [<c099665c>]
_raw_spin_unlock_irqrestore+0x88/0x90
[18586.277550] hardirqs last disabled at (19440): [<c09962f8>]
_raw_spin_lock_irqsave+0x2c/0x68
[18586.277562] softirqs last  enabled at (0): [<c0120c18>]
copy_process.part.5+0x370/0x1a54
[18586.277568] softirqs last disabled at (0): [<  (null)>]   (null)
[18586.277571]
               other info that might help us debug this:
[18586.277574]  Possible unsafe locking scenario:

[18586.277576]        CPU0
[18586.277578]        ----
[18586.277580]   lock(&rq->lock);
[18586.277587]   <Interrupt>
[18586.277588]     lock(&rq->lock);
[18586.277594]
                *** DEADLOCK ***

[18586.277599] 2 locks held by hackbench/18985:
[18586.277601]  #0:  (&u->iolock){+.+.}, at: [<c081de30>]
unix_stream_read_generic+0xb0/0x7e4
[18586.277624]  #1:  (rcu_read_lock){....}, at: [<c081b73c>]
unix_write_space+0x0/0x2b0
[18586.277640]
               stack backtrace:
[18586.277651] CPU: 1 PID: 18985 Comm: hackbench Tainted:
G        W       4.14.20-rt17-henri-1 #15
[18586.277654] Hardware name: Freescale i.MX6 Quad/DualLite (Device
Tree)
[18586.277683] [<c0111600>] (unwind_backtrace) from [<c010bfe8>]
(show_stack+0x20/0x24)
[18586.277701] [<c010bfe8>] (show_stack) from [<c097d79c>]
(dump_stack+0x9c/0xd0)
[18586.277714] [<c097d79c>] (dump_stack) from [<c0175424>]
(print_usage_bug+0x1c8/0x2d0)
[18586.277725] [<c0175424>] (print_usage_bug) from [<c0175970>]
(mark_lock+0x444/0x69c)
[18586.277736] [<c0175970>] (mark_lock) from [<c0177114>]
(__lock_acquire+0x23c/0x172c)
[18586.277748] [<c0177114>] (__lock_acquire) from [<c017935c>]
(lock_acquire+0x288/0x32c)
[18586.277759] [<c017935c>] (lock_acquire) from [<c0996150>]
(_raw_spin_lock+0x48/0x58)
[18586.277774] [<c0996150>] (_raw_spin_lock) from [<c0992134>]
(__schedule+0x128/0x6ac)
[18586.277789] [<c0992134>] (__schedule) from [<c09929c0>]
(preempt_schedule_irq+0x5c/0x8c)
[18586.277801] [<c09929c0>] (preempt_schedule_irq) from [<c010cc8c>]
(svc_preempt+0x8/0x2c)
[18586.277815] [<c010cc8c>] (svc_preempt) from [<c0190b60>]
(__rcu_read_unlock+0x40/0x98)
[18586.277829] [<c0190b60>] (__rcu_read_unlock) from [<c081b9a4>]
(unix_write_space+0x268/0x2b0)
[18586.277847] [<c081b9a4>] (unix_write_space) from [<c07643d8>]
(sock_wfree+0x70/0xac)
[18586.277860] [<c07643d8>] (sock_wfree) from [<c081aff0>]
(unix_destruct_scm+0x74/0x7c)
[18586.277876] [<c081aff0>] (unix_destruct_scm) from [<c076a8dc>]
(skb_release_head_state+0x78/0x80)
[18586.277891] [<c076a8dc>] (skb_release_head_state) from [<c076ac28>]
(skb_release_all+0x1c/0x34)
[18586.277905] [<c076ac28>] (skb_release_all) from [<c076ac5c>]
(__kfree_skb+0x1c/0x28)
[18586.277919] [<c076ac5c>] (__kfree_skb) from [<c076b470>]
(consume_skb+0x228/0x2b4)
[18586.277933] [<c076b470>] (consume_skb) from [<c081e3d4>]
(unix_stream_read_generic+0x654/0x7e4)
[18586.277947] [<c081e3d4>] (unix_stream_read_generic) from
[<c081e65c>] (unix_stream_recvmsg+0x5c/0x68)
[18586.277969] [<c081e65c>] (unix_stream_recvmsg) from [<c075f0e0>]
(sock_recvmsg+0x28/0x2c)
[18586.277983] [<c075f0e0>] (sock_recvmsg) from [<c075f174>]
(sock_read_iter+0x90/0xb8)
[18586.277998] [<c075f174>] (sock_read_iter) from [<c02559ec>]
(__vfs_read+0x108/0x12c)
[18586.278010] [<c02559ec>] (__vfs_read) from [<c0255ab0>]
(vfs_read+0xa0/0x10c)
[18586.278021] [<c0255ab0>] (vfs_read) from [<c0255f4c>]
(SyS_read+0x50/0x88)
[18586.278035] [<c0255f4c>] (SyS_read) from [<c01074e0>]
(ret_fast_syscall+0x0/0x28)



