Re: What does inconsistent lock state mean?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On Thu, Dec 8, 2011 at 11:50 PM, Kai Meyer <kai@xxxxxxxxxx> wrote:
On 12/08/2011 07:47 AM, Srivatsa Bhat wrote:
2 things:
1. Documentation/lockdep-design.txt explains the "cryptic lock state symbols".
2. Please post the lockdep splat _exactly_ as it appears, and _in full_ 
    (and without line-wrapping, if possible). Usually lockdep is intelligent
    enough to tell you the possible scenario that would lock up your system.
    That gives a very good clue, in case you find it difficult to make out what
     is wrong from the cryptic symbols.

Regards,
Srivatsa S. Bhat




_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

Oh, sorry. I suppose I only included things that made any sense to me. If I were to hazard a guess after reading through the design doc, it sounds like there's a problem with the context in which locks are being acquired? That seems odd to me, since I don't get the inconsistent lock state until I'm trying to spin_unlock
&sblsnap_snapshot_table[i].sblsnap_lock (which is why I assume it's listed as one that's currently held.


Dec  7 15:52:20 dev2 kernel: =================================
Dec  7 15:52:20 dev2 kernel: [ INFO: inconsistent lock state ]
Dec  7 15:52:20 dev2 kernel: 2.6.32-220.el6.x86_64.debug #1
Dec  7 15:52:20 dev2 kernel: ---------------------------------
Dec  7 15:52:20 dev2 kernel: inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
Dec  7 15:52:20 dev2 kernel: tee/1966 [HC0[0]:SC0[0]:HE1:SE1] takes:
Dec  7 15:52:20 dev2 kernel: (&vblk->lock){?.-...}, at: [<ffffffffa04c4d2a>] sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
Dec  7 15:52:20 dev2 kernel: {IN-HARDIRQ-W} state was registered at:
Dec  7 15:52:20 dev2 kernel:  [<ffffffff810afbca>] __lock_acquire+0x77a/0x1570
Dec  7 15:52:20 dev2 kernel:  [<ffffffff810b0a64>] lock_acquire+0xa4/0x120
Dec  7 15:52:20 dev2 kernel:  [<ffffffff81520c75>] _spin_lock_irqsave+0x55/0xa0
Dec  7 15:52:20 dev2 kernel:  [<ffffffffa006b19b>] blk_done+0x2b/0x110 [virtio_blk]
Dec  7 15:52:20 dev2 kernel:  [<ffffffffa00401dc>] vring_interrupt+0x3c/0xd0 [virtio_ring]
Dec  7 15:52:20 dev2 kernel:  [<ffffffff810ec080>] handle_IRQ_event+0x50/0x160
Dec  7 15:52:20 dev2 kernel:  [<ffffffff810ee840>] handle_edge_irq+0xe0/0x170
Dec  7 15:52:20 dev2 kernel:  [<ffffffff8100e059>] handle_irq+0x49/0xa0
Dec  7 15:52:20 dev2 kernel:  [<ffffffff81526cdc>] do_IRQ+0x6c/0xf0
Dec  7 15:52:20 dev2 kernel:  [<ffffffff8100ba93>] ret_from_intr+0x0/0x16
Dec  7 15:52:20 dev2 kernel:  [<ffffffff810148e2>] default_idle+0x52/0xc0
Dec  7 15:52:20 dev2 kernel:  [<ffffffff81009e0b>] cpu_idle+0xbb/0x110
Dec  7 15:52:20 dev2 kernel:  [<ffffffff81516623>] start_secondary+0x211/0x254
Dec  7 15:52:20 dev2 kernel: irq event stamp: 4699
Dec  7 15:52:20 dev2 kernel: hardirqs last  enabled at (4699): [<ffffffff81179e81>] __kmalloc+0x241/0x330
Dec  7 15:52:20 dev2 kernel: hardirqs last disabled at (4698): [<ffffffff81179d60>] __kmalloc+0x120/0x330
Dec  7 15:52:20 dev2 kernel: softirqs last  enabled at (4696): [<ffffffff81076baa>] __do_softirq+0x14a/0x200
Dec  7 15:52:20 dev2 kernel: softirqs last disabled at (4681): [<ffffffff8100c30c>] call_softirq+0x1c/0x30
Dec  7 15:52:20 dev2 kernel:
Dec  7 15:52:20 dev2 kernel: other info that might help us debug this:
Dec  7 15:52:20 dev2 kernel: 1 lock held by tee/1966:
Dec  7 15:52:20 dev2 kernel: #0:  (&sblsnap_snapshot_table[i].sblsnap_lock){+.+.+.}, at: [<ffffffffa04c4b7c>] sblsnap_snap_now+0xac/0x2a0 [sblsnap]
Dec  7 15:52:20 dev2 kernel:
Dec  7 15:52:20 dev2 kernel: stack backtrace:
Dec  7 15:52:20 dev2 kernel: Pid: 1966, comm: tee Not tainted 2.6.32-220.el6.x86_64.debug #1
Dec  7 15:52:20 dev2 kernel: Call Trace:
Dec  7 15:52:20 dev2 kernel: [<ffffffff810ad947>] ? print_usage_bug+0x177/0x180
Dec  7 15:52:20 dev2 kernel: [<ffffffff810ae8ed>] ? mark_lock+0x35d/0x430
Dec  7 15:52:20 dev2 kernel: [<ffffffff810afa59>] ? __lock_acquire+0x609/0x1570
Dec  7 15:52:20 dev2 kernel: [<ffffffff810ab31d>] ? trace_hardirqs_off+0xd/0x10
Dec  7 15:52:20 dev2 kernel: [<ffffffff815208e7>] ? _spin_unlock_irqrestore+0x67/0x80
Dec  7 15:52:20 dev2 kernel: [<ffffffff8106ec43>] ? release_console_sem+0x203/0x250
Dec  7 15:52:20 dev2 kernel: [<ffffffff810b0a64>] ? lock_acquire+0xa4/0x120
Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4d2a>] ? sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
Dec  7 15:52:20 dev2 kernel: [<ffffffff81520af6>] ? _spin_lock+0x36/0x70
Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4d2a>] ? sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4d2a>] ? sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4de4>] ? sblsnap_patch_blkdev+0x74/0x120 [sblsnap]
Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4eaf>] ? sblsnap_get_snapshot+0x1f/0x60 [sblsnap]
Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4f59>] ? sblsnap_create_snapshot+0x69/0x120 [sblsnap]
Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c53cb>] ? sblsnap_config_write+0x26b/0x2c0 [sblsnap]
Dec  7 15:52:20 dev2 kernel: [<ffffffff81200343>] ? proc_file_write+0x73/0xb0
Dec  7 15:52:20 dev2 kernel: [<ffffffff812002d0>] ? proc_file_write+0x0/0xb0
Dec  7 15:52:20 dev2 kernel: [<ffffffff811f97c5>] ? proc_reg_write+0x85/0xc0
Dec  7 15:52:20 dev2 kernel: [<ffffffff81193318>] ? vfs_write+0xb8/0x1a0
Dec  7 15:52:20 dev2 kernel: [<ffffffff810e68b2>] ? audit_syscall_entry+0x272/0x2a0
Dec  7 15:52:20 dev2 kernel: [<ffffffff81193d21>] ? sys_write+0x51/0x90
Dec  7 15:52:20 dev2 kernel: [<ffffffff8100b0b2>] ? system_call_fastpath+0x16/0x1b


The line print_usage_bug_scenario(this); seems to be missing in your kernel's print_usage_bug()
function in kernel/lockdep.c If it was there (it is there in newer kernels), it would have showed
pictorially what is the problem, and then most likely you wouldn't have to look at anything else
to figure out what's the problem! That's the reason I asked for full output, but alas your lockdep
seems to be old.

Though, what exactly is the problem is not immediately apparent to me, I would certainly 
suggest using spin_lock_irqsave() and spin_unlock_irqrestore() functions, instead of just using
the spin_lock() and spin_unlock() versions, whenever you are dealing with locks that have the
possibility of being acquired by interrupt contexts (i.e., while servicing interrupts).

Regards,
Srivatsa S. Bhat


_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

[Index of Archives]     [Newbies FAQ]     [Linux Kernel Mentors]     [Linux Kernel Development]     [IETF Annouce]     [Git]     [Networking]     [Security]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux SCSI]     [Linux ACPI]
  Powered by Linux