On Wed, 2019-04-03 at 08:09 -0700, Bart Van Assche wrote:
> On Wed, 2019-04-03 at 10:53 -0400, Laurence Oberman wrote:
> > Update on this issue
> >
> > If I reset the SRP target server and do not reboot it I do not see
> > the block-mq race with the SRP initiator.
> > Just resetting the SRP target array and not running the reboot
> > avoids the shutdown scripts of ib_srpt and seems somehow to prevent
> > the race with the initiator code.
> >
> > Given that I see very little churn and activity with LIO lately I
> > don't know if it's worth spending more time on trying to isolate
> > this.
> >
> > If anybody has an SRP native array (like DDN) and can reboot it with
> > 5.x plus, that would be a good test.
> >
> > I will reach out to DDN and ask if they can test it.
>
> Hi Laurence,
>
> If the initiator side crashes it means there is a bug in the kernel at
> the initiator side. It's not clear to me why you are commenting on the
> target side in this e-mail thread?
>
> Thanks,
>
> Bart.

Hello Bart

It does indeed seem to be an initiator issue, but it is related to
something that happens when the target reboots. If I reset the target
instead, the initiator survives. I did not know this until today.

Until today I was rebooting the target server to reproduce the race on
the initiator side. Today I started testing the same way as I did when
I reproduced the Qlogic lock recursion, and I noticed I could no longer
reproduce it.

In other words, rebooting the target closes the SRP target ports in
such a way that the initiator gets into the block-mq race. Using
echo b or a power reset, the array just disappears and the initiator
is fine.

Apologies if I was not clear.

Regards
Laurence