On 9/28/23 9:54 AM, Chengfeng Ye wrote: > iscsi_complete_pdu() is called by several rx callback, under > softirq context. Thus the callsite of it inside qla4xxx_task_work() > should better disable bottom half, as work queue is executed under > process context, it needs to protect race with softirq context > locking to avoid deadlock. > > <potential deadlock #1> > qla4xxx_task_work() > --> iscsi_complete_pdu() > --> spin_lock(&conn->session->back_lock); > <interrupt> > --> iscsi_tcp_data_recv_done() > --> iscsi_complete_pdu() > --> spin_lock(&conn->session->back_lock) (deadlock) > This will not happen, because qla4xxx does not use the libiscsi_tcp code. If you had a iscsi session with qla4xxx and another iscsi session with iscsi_tcp, then you could get an interrupt like this, but the "conn->session->back_lock" that's taken above would be for the iscsi_tcp session. So you would need to make sure that the above can't happen for iscsi_tcp an iscsi_tcp session. Same with the below analysis. > <potential deadlock #2> > qla4xxx_task_work() > --> iscsi_complete_pdu() > --> __iscsi_complete_pdu() > --> spin_lock(&conn->session->frwd_lock) > <interrupt> > --> iscsi_tcp_data_recv_done() > --> iscsi_complete_pdu() > --> __iscsi_complete_pdu() > --> spin_lock(&conn->session->frwd_lock) (deadlock) > > This flaw was found by an experimental static analysis tool I am > developing for irq-related deadlock. > > To avoid the problem, disable bh inside qla4xxx_task_work() before > calling iscsi_complete_pdu(). > > Signed-off-by: Chengfeng Ye <dg573847474@xxxxxxxxx> > --- > drivers/scsi/qla4xxx/ql4_os.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/scsi/qla4xxx/ql4_os.c b/drivers/scsi/qla4xxx/ql4_os.c > index 675332e49a7b..c60781148e6c 100644 > --- a/drivers/scsi/qla4xxx/ql4_os.c > +++ b/drivers/scsi/qla4xxx/ql4_os.c > @@ -3382,7 +3382,9 @@ static void qla4xxx_task_work(struct work_struct *wdata) > hdr->itt = itt; > data = task_data->resp_buffer + hdr_len; > data_len = task_data->resp_len - hdr_len; > + local_bh_disable(); > iscsi_complete_pdu(conn, hdr, data, data_len); > + local_bh_enable(); > break; > default: > ql4_printk(KERN_ERR, ha, "Passthru failed status = 0x%x\n",