> -----Original Message----- > From: Jitendra Bhivare [mailto:jitendra.bhivare@xxxxxxxxxxxx] > Sent: Wednesday, August 10, 2016 6:16 PM > To: 'Mike Christie'; 'Martin K. Petersen' > Cc: 'linux-scsi@xxxxxxxxxxxxxxx' > Subject: RE: [PATCH 02/28] be2iscsi: Replace _bh with _irqsave/irqrestore > > > -----Original Message----- > > From: Mike Christie [mailto:mchristi@xxxxxxxxxx] > > Sent: Tuesday, August 09, 2016 11:19 PM > > To: Jitendra Bhivare; Martin K. Petersen > > Cc: linux-scsi@xxxxxxxxxxxxxxx > > Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > _irqsave/irqrestore > > > > On 08/09/2016 03:29 AM, Jitendra Bhivare wrote: > > >> -----Original Message----- > > >> From: Martin K. Petersen [mailto:martin.petersen@xxxxxxxxxx] > > >> Sent: Tuesday, August 09, 2016 6:35 AM > > >> To: Mike Christie > > >> Cc: Jitendra Bhivare; linux-scsi@xxxxxxxxxxxxxxx > > >> Subject: Re: [PATCH 02/28] be2iscsi: Replace _bh with > > > _irqsave/irqrestore > > >> > > >>>>>>> "Mike" == Mike Christie <mchristi@xxxxxxxxxx> writes: > > >> > > >>>> In beiscsi_alloc_pdu, _bh versions of spin_lock are being used > > >>>> for protecting SGLs and WRBs. _bh versions are needed as the > > >>>> function gets invoked in process context and BLOCK_IOPOLL softirq. > > >>>> > > >>>> In spin_unlock_bh, after releasing the lock and enabling BH, > > >>>> do_softirq is called which executes till last SOFTIRQ. > > >>>> > > >>>> beiscsi_alloc_pdu is called under session lock. Through block > > >>>> layer, iSCSI stack in some cases send IOs with interrupts disabled. > > >>>> In such paths, > > >> > > >> > > >> Mike> What path is this? Is this with mq enabled or disabled? > > >> > > >> Jitendra? > > >> > > >> -- > > >> Martin K. Petersen Oracle Linux Engineering > > > [JB] Sorry for the delayed response, there was some issue with my > > > mail client. > > > There are paths block layer where IRQs are disabled with > > > request_queue queue_lock. > > > - blk_timeout_work : this triggers NOP-OUT thru' > > > iscsi_eh_cmd_timed_out. > > > - blk_execute_rq_nowait > > > > I am not sure I understand the problem/solution. Why does the patch > > only need to modify be2iscsi's locking? Why wouldn't you need to also > > change the libiscsi session lock locking? You don't need irqs to be > > running right? You are just doing this so the locking calls (irq vs bh) > > matches > up? > > > > Don't you also need to fix the non-mq IO path. scsi_request_fn needs a > > fix to use spin_lock_irqsave/spin_unlock_irqrestore because > > blk_execute_rq_nowait already disables them right? > > > [JB] The issue is hard lockup seen on a CPU which is waiting for session > lock > taken by another CPU processing do_softirq. This do_softirq is getting > called > from be2iscsi spin_unlock_bh. > > The contention gets easily exposed with be2iscsi as it data structures are > accessed in process context and IRQ_POLL softirq. > > iscsi_eh_cmd_timed_out gets saved because it is using base spin_lock > version > for session lock (frwd and back). > iscsi_queuecommand is using _bh which is bit scary. > > Yes, that needs to be fixed too... and then this other path looks buggy > too - > blk_run_queue (takes queue_lock with irqsave) ->scsi_request_fn (unlocks > it with > _irq) [JB] In this case, the other CPU executing do_softirq was too sending IO under session lock. When be2iscsi does spin_unlock_bh after getting resources for the IO, unlock_bh finds pending IRQ_POLL softirq. The issue is more to do with driver using _unlock_bh. It is causing a WARN_ON too in unlock_bh where ever used for disabled IRQs. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html