Re: lockdep warning in lpfc v5.9-rc4

On 9/11/2020 3:49 PM, James Smart wrote:
On 9/11/2020 1:34 AM, Daniel Wagner wrote:
Hi,

I just hit a lockdep warning in lpfc. Not sure if it is a valid complaint
or not:

  ================================
  WARNING: inconsistent lock state
  5.9.0-rc4-default #80 Tainted: G            E
  --------------------------------
  inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
  kworker/2:2/264 [HC0[0]:SC0[0]:HE1:SE1] takes:
  ffff9a726e7cd668 (&lpfc_ncmd->buf_lock){+.?.}-{2:2}, at: lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
  {IN-SOFTIRQ-W} state was registered at:
    lock_acquire+0xb2/0x3a0
    _raw_spin_lock+0x30/0x70
    lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
    lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
    lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
    __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
    __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
    lpfc_cq_poll_hdler+0x16/0x20 [lpfc]
    irq_poll_softirq+0x96/0x150
    __do_softirq+0xd3/0x47b
    asm_call_on_stack+0x12/0x20
    do_softirq_own_stack+0x52/0x60
    irq_exit_rcu+0xea/0xf0
    common_interrupt+0xa9/0x1a0
    asm_common_interrupt+0x1e/0x40
    refresh_cpu_vm_stats+0x20c/0x2a0
    vmstat_update+0xf/0x50
    process_one_work+0x2b7/0x640
    worker_thread+0x39/0x3f0
    kthread+0x139/0x150
    ret_from_fork+0x22/0x30
  irq event stamp: 2621
  hardirqs last  enabled at (2621): [<ffffffff91ff525d>] _raw_spin_unlock_irqrestore+0x2d/0x50
  hardirqs last disabled at (2620): [<ffffffff91ff5a38>] _raw_spin_lock_irqsave+0x88/0x8a
  softirqs last  enabled at (1420): [<ffffffff92200351>] __do_softirq+0x351/0x47b
  softirqs last disabled at (1399): [<ffffffff92001032>] asm_call_on_stack+0x12/0x20
  other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&lpfc_ncmd->buf_lock);
   <Interrupt>
     lock(&lpfc_ncmd->buf_lock);

   *** DEADLOCK ***
    2 locks held by kworker/2:2/264:
   #0: ffff9a727ccd2d48 ((wq_completion)lpfc_wq#4){+.+.}-{0:0}, at: process_one_work+0x237/0x640
   #1: ffffb73dc0d37e68 ((work_completion)(&queue->irqwork)){+.+.}-{0:0}, at: process_one_work+0x237/0x640
    stack backtrace:
  CPU: 2 PID: 264 Comm: kworker/2:2 Tainted: G            E 5.9.0-rc4-default #80
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
  Workqueue: lpfc_wq lpfc_sli4_hba_process_cq [lpfc]
  Call Trace:
   dump_stack+0x8d/0xbb
   mark_lock+0x5e5/0x690
   ? print_shortest_lock_dependencies+0x180/0x180
   __lock_acquire+0x2d5/0xbf0
   lock_acquire+0xb2/0x3a0
   ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
   ? lock_acquire+0xb2/0x3a0
   _raw_spin_lock+0x30/0x70
   ? lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
   lpfc_scsi_cmd_iocb_cmpl+0x49/0xae0 [lpfc]
   lpfc_sli4_fp_handle_fcp_wcqe.isra.29+0xfb/0x390 [lpfc]
   ? ret_from_fork+0x22/0x30
   ? unwind_next_frame+0x1fc/0x640
   ? create_prof_cpu_mask+0x20/0x20
   ? arch_stack_walk+0x8f/0xf0
   ? ret_from_fork+0x22/0x30
   ? lpfc_handle_fcp_err+0xb00/0xb00 [lpfc]
   ? lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
   lpfc_sli4_fp_handle_cqe+0x172/0x490 [lpfc]
   __lpfc_sli4_process_cq+0xfd/0x270 [lpfc]
   ? lpfc_sli4_sp_handle_abort_xri_wcqe.isra.54+0x170/0x170 [lpfc]
   __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc]
   process_one_work+0x2b7/0x640
   ? find_held_lock+0x34/0xa0
   ? process_one_work+0x640/0x640
   worker_thread+0x39/0x3f0
   ? process_one_work+0x640/0x640
   kthread+0x139/0x150
   ? kthread_park+0x90/0x90
   ret_from_fork+0x22/0x30


Thanks,
Daniel


It's likely valid, but rare. Most of the time, only one of these two flows is occurring. We'll look into it.
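
To make the report concrete: what lockdep flags is that lpfc_scsi_cmd_iocb_cmpl() takes lpfc_ncmd->buf_lock both from irq_poll softirq context and from the lpfc_wq worker, the latter with softirqs still enabled. A minimal sketch of that pattern in isolation -- the demo_* names below are hypothetical, not the actual lpfc code:

/*
 * Minimal sketch of the lock usage lockdep is complaining about.
 * The demo_* names are hypothetical, not the actual lpfc code.
 */
#include <linux/spinlock.h>
#include <linux/workqueue.h>

static DEFINE_SPINLOCK(demo_lock);

/* Completion path A: runs in irq_poll softirq context. */
static void demo_complete_from_softirq(void)
{
	spin_lock(&demo_lock);		/* lockdep records {IN-SOFTIRQ-W} */
	/* ... complete the command ... */
	spin_unlock(&demo_lock);
}

/* Completion path B: runs from a workqueue with softirqs enabled. */
static void demo_complete_from_work(struct work_struct *work)
{
	spin_lock(&demo_lock);		/* lockdep records {SOFTIRQ-ON-W} */
	/*
	 * If the irq_poll softirq fired on this CPU right here and took
	 * demo_lock, it would spin forever: the single-CPU deadlock in
	 * the "Possible unsafe locking scenario" above.
	 */
	spin_unlock(&demo_lock);
}

/*
 * If the two paths really could race, the usual fix would be
 * spin_lock_bh() (or spin_lock_irqsave()) in the workqueue path.
 */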

-- james

Looking a little further - it doesn't look like this can be hit. Although both the softirq and the work element could be running, both call the same routine, and only one caller will succeed in a cmpxchg() for the queue. As such, only one entity will process the queue, and there would be no way for both to be looking at the same I/O completion.
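
For reference, the guard described above is the usual atomic claim-the-queue pattern; a hedged sketch with hypothetical names (demo_*), not the actual lpfc fields:

#include <linux/atomic.h>

struct demo_queue {
	atomic_t in_use;	/* 0 = idle, 1 = a context owns the queue */
};

/* Called from both the irq_poll softirq and the workqueue paths. */
static void demo_process_cq(struct demo_queue *q)
{
	/* Only the context that flips 0 -> 1 gets to drain the queue. */
	if (atomic_cmpxchg(&q->in_use, 0, 1) != 0)
		return;		/* someone else is already draining it */

	/* ... process completions; no second context can reach here ... */

	atomic_set(&q->in_use, 0);	/* release the queue */
}

Since at most one context drains a given queue at a time, the two call sites in the trace can never complete the same I/O concurrently, which is why the flagged scenario should not occur in practice.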

-- james



