On 11/25/22 3:34 AM, Dmitry Bogdanov wrote: > On Mon, Nov 21, 2022 at 01:25:55PM -0600, Mike Christie wrote: >> >> On 11/21/22 7:35 AM, Dmitry Bogdanov wrote: >>>> >>> I will vote to your old patch. qla2xxx was fixed long time ago. >> >> What is the qla fix? I think we still leak. In commit >> >> commit 605e74025f953b995a3a241ead43bde71c1c99b5 >> Author: Mike Christie <michael.christie@xxxxxxxxxx> >> Date: Sun Nov 1 12:59:31 2020 -0600 >> >> scsi: qla2xxx: Move sess cmd list/lock to driver >> >> when I changed the locking I had added the check: >> >> static void tcm_qla2xxx_aborted_task(struct se_cmd *se_cmd) >> { >> struct qla_tgt_cmd *cmd; >> unsigned long flags; >> >> if (se_cmd->se_cmd_flags & SCF_SCSI_TMR_CDB) >> return; >> > Yes, I was thinking about that commit. >> because tmrs are not on the sess_cmd_list that's accessed the >> next line down. We don't crash as a result, but I think we need >> to add code to send the cleanup command to the FW. Bodo and I >> were working on that part, but someone with more qla experience >> needed to work on it so it could be properly tested. We didn't >> hear back from the qla engineers so progress had stalled. >> > Yes, you are right, FW expects some response on every ABORT IOCB > to clear its resources. > > I can prepare the patch for qla2xxx. That would be awesome. > > But still, I do not see a sense of new Bodo's solution. Drivers are crashing in the aborted_task callout. His idea was a workaround/hack so we would call the queue_tm_rsp callout instead of aborted_ask like we did before which would avoid the crashes but allow us to keep the async behavior. To the initiator's it would work like before where it looks like a race where the abort response is received right after it has sent the lun reset. His assumption is that initiators handled that before so it would continue to work. If you fix qla, then we can easily fix loop and xen and we can do the correct behavior and keep the async feature.