locking bug in __iscsi_conn_send_pdu

Christopher Unkel <cunkel@xxxxxxxxxxxxxx> · Fri, 11 May 2018 17:01:59 -0700

Hello All,

I've been trying to track down the cause of some soft lockups with
call traces involving the iscsi initiator.  I believe I have found a
locking error in __iscsi_conn_send_pdu that is responsible.

A previous commit (4fa507992f0a ("scsi: libiscsi: Fix locking in
__iscsi_conn_send_pdu") changed the way that __iscsi_conn_send_pdu
acquires session->back_lock, from using spin_lock_bh to using
spin_lock.  However, half a dozen lines up the function is still using
spin_lock_bh to acquire conn->taskqueuelock.  I think the locking for
conn->taskqueuelock should be changed in the same way the locking for
conn->session->back_lock was changed.  A patch to do so follows.

Furthermore, I think that the message for the previous commit
understates the impact of the improper use of spin_lock_bh.  It states
that the use of spin_lock_bh here can trigger a WARN_ONCE.  I believe
it creates the possibility of deadlocks involving session->frwd_lock
and/or queue_lock, and that I have observed such a deadlock.

Here is the call trace from one soft lockup I have seen:

    Call Trace:
     <IRQ>
     do_raw_spin_lock+0xb2/0xc0
     _raw_spin_lock+0x55/0x70
     iscsi_check_transport_timeouts+0x24/0x1a0 [libiscsi]
     call_timer_fn+0xb9/0x3a0
     ? iscsi_complete_pdu+0x60/0x60 [libiscsi]
     run_timer_softirq+0x254/0x650
     __do_softirq+0xc5/0x4eb
     ? __iscsi_conn_send_pdu+0x1e3/0x420 [libiscsi]
     irq_exit+0xff/0x110
     smp_apic_timer_interrupt+0x3d/0x50
     apic_timer_interrupt+0x9d/0xb0
    RIP: 0010:__local_bh_enable_ip+0x74/0xa0
    RSP: 0018:ffffb0a60a26bc98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
    RAX: ffff901985f30000 RBX: 00000000fffffe00 RCX: 0000000000000002
    RDX: ffff901985f30000 RSI: 0000000000000001 RDI: ffffffffa9837ed4
    RBP: ffffb0a60a26bca8 R08: 0000000000000000 R09: 0000000000000000
    R10: ffff901985f30000 R11: 0000000000000002 R12: ffffffffc01b1ca3
    R13: ffff90198d8916a0 R14: ffff90199207d480 R15: ffff90199207d400
     </IRQ>
     ? __iscsi_conn_send_pdu+0x1e3/0x420 [libiscsi]
     ? dump_stack+0xaa/0xc6
     ? __local_bh_enable_ip+0x70/0xa0
     _raw_spin_unlock_bh+0x35/0x40
     __iscsi_conn_send_pdu+0x1e3/0x420 [libiscsi]
     iscsi_send_nopout+0xbe/0x110 [libiscsi]
     iscsi_eh_cmd_timed_out+0x2b7/0x2c0 [libiscsi]
     scsi_times_out+0x86/0x2b0
     blk_rq_timed_out+0x1f/0x60
     blk_timeout_work+0x11b/0x170
     process_one_work+0x250/0x690
     worker_thread+0x4e/0x3c0
     kthread+0x117/0x150
     ? process_one_work+0x690/0x690
     ? kthread_create_on_node+0x70/0x70
     ret_from_fork+0x2a/0x40

blk_timeout_work calls blk_rq_timed_out inside a spin_lock_irqsave, so
it should not be possible to take the softirq at this point.  I
believe, but would like to someone to confirm, that it is the use of
spin_lock_bh/spin_unlock_bh that makes it possible to take the irq
here.  A deadlock results from spin_lock of session->frwd_lock in the
softirq while it is already held by same cpu, having been acquired in
iscsi_eh_cmd_timed_out.

I have seen another similar soft lockup trace:

    <IRQ>
    queued_spin_lock_slowpath
    _raw_spin_lock
    iscsi_complete_pdu
    iscsi_tcp_process_data_in
    ... network driver and tcp stack ...
    net_rx_action
    __do_softirq
    do_softirq_own_stack
    <EOI>
    do_softirq
    __local_bh_enable_ip
    _raw_spin_unlock_bh
    __iscsi_conn_send_pdu
    iscsi_eh_cmd_timed_out
    scsi_times_out
    blk_rq_timed_out
    blk_timeout_work
    ...

Which I think is also a deadlock involving the spin_lock of
session->frwd_lock, in this instance following case ISCSI_OP_NOOP_IN:
in __iscsi_complete_pdu.

Feedback on the proposed change, patch, and whether it would be
expected to address the soft lockups I have seen are most welcome.

Best regards,

  --Chris