On 10/13/21 12:51 PM, Mike Christie wrote:
> On 10/13/21 8:21 AM, Konstantin Shelekhin wrote:
>> Hi,
>>
>> I really need the collective wisdom.
>>
>> Not long ago we uncovered a problem with iSCSI ABORT TASK handling.
>> Currently it's not possible to abort a WRITE_10 command in the
>> TRANSPORT_WRITE_PENDING state, because ABORT TASK will hang itself in
>> the process:
>>
>>   # dmesg | tail -2
>>   [   83.563505] ABORT_TASK: Found referenced iSCSI task_tag: 3372979269
>>   [   84.593545] Unable to recover from DataOut timeout while in ERL=0, closing iSCSI connection for I_T Nexus <nexus>
>>
>>   # ps aux | awk '$8 ~/D/'
>>   root      32  0.0  0.0  0  0 ?  D  15:19  0:00 [kworker/0:1+events]
>>   root    1187  0.0  0.0  0  0 ?  D  15:20  0:00 [iscsi_ttx]
>>
>>   # cat /proc/32/stack
>>   [<0>] target_put_cmd_and_wait+0x68/0xa0
>>   [<0>] core_tmr_abort_task.cold+0x16b/0x192
>>   [<0>] target_tmr_work+0x9e/0xe0
>>   [<0>] process_one_work+0x1d4/0x370
>>   [<0>] worker_thread+0x48/0x3d0
>>   [<0>] kthread+0x122/0x140
>>   [<0>] ret_from_fork+0x22/0x30
>>
>>   # cat /proc/1187/stack
>>   [<0>] __transport_wait_for_tasks+0xaf/0x100
>>   [<0>] transport_generic_free_cmd+0xe9/0x180
>>   [<0>] iscsit_free_cmd+0x50/0xb0
>>   [<0>] iscsit_close_connection+0x47d/0x8c0
>>   [<0>] iscsit_take_action_for_connection_exit+0x6f/0xf0
>>   [<0>] iscsi_target_tx_thread+0x184/0x200
>>   [<0>] kthread+0x122/0x140
>>   [<0>] ret_from_fork+0x22/0x30
>>
>> What happens:
>>
>>   1. Initiator sends the WRITE_10 CDB
>>   2. Target parses the CDB and sends R2T
>>   3. Target starts the Data-Out timer
>>   4. Initiator sends ABORT TASK; no new data from the Initiator after this
>>   5. Target starts aborting the WRITE_10, gets into core_tmr_abort_task()
>>      and starts waiting for the request to complete
>>   6. Nothing happens
>>   7. The Data-Out timer expires, connection teardown starts and gets
>>      stuck waiting for ABORT TASK, which waits for WRITE_10
>>
>> The ABORT TASK processing looks roughly like this:
>>
>>   iscsi_rx_opcode
>>     iscsi_handle_task_mgt_cmd
>>       iscsi_tmr_abort_task
>>       transport_generic_handle_tmr
>>         if (tmr_cmd->transport_state & CMD_T_ABORTED)
>>           target_handle_abort
>>         else
>>           target_tmr_work
>>             if (tmr_cmd->transport_state & CMD_T_ABORTED)
>>               target_handle_abort
>>             else
>>               core_tmr_abort_task
>>                 ret = __target_check_io_state
>>                   if (write_cmd->transport_state & CMD_T_STOP)
>>                     return -1
>>                   write_cmd->transport_state |= CMD_T_ABORTED
>>                   return 0
>>                 if (!ret)
>>                   list_move_tail(&write_cmd->state_list, &aborted)
>>                   target_put_cmd_and_wait(&write_cmd)
>>
>> As I see it, the main problem is that the abort path can't initiate the
>> command termination itself; it simply waits for the request to handle
>> this on the execution path, as in target_execute_cmd():
>>
>>   target_execute_cmd
>>     target_cmd_interrupted
>>       INIT_WORK(&cmd->work, target_abort_work)
>>
>> However, in this case the request is never going to be executed, because
>> the Initiator will not send the Data-Out buffer.
>>
>> I have a couple of ideas on how to fix this, but they all look kinda
>> ugly. The one that currently works around this for me:
>>
>>   core_tmr_abort_task():
>>
>>   [...]
>>
>>   spin_lock_irqsave(&se_cmd->t_state_lock, flags);
>>   write_pending = se_cmd->t_state == TRANSPORT_WRITE_PENDING;
>>   spin_unlock_irqrestore(&se_cmd->t_state_lock, flags);
>>
>>   if (write_pending && se_cmd->se_tfo->abort_write_pending)
>>           se_cmd->se_tfo->abort_write_pending(se_cmd);
>>
>>   target_put_cmd_and_wait(se_cmd);
>>
>>   [...]
>>
>> The new method abort_write_pending() is defined only for iSCSI and calls
>> target_handle_abort().
>> However, this opens up another can of worms, because this code heavily
>> races with R2T sending and requires a couple of checks to "work most of
>> the time". Not ideal, by far.
>>
>> I can make this one better by introducing R2T list draining that ensures
>> the proper order during cleanup, but maybe there is a much easier way
>> that I'm not seeing here.
>
> Cc'ing Maurizio to make sure I don't add his original bug back.
>
> If I understand you correctly, I think I added this bug in:
>
> commit f36199355c64a39fe82cfddc7623d827c7e050da
> Author: Mike Christie <michael.christie@xxxxxxxxxx>
> Date:   Fri Nov 13 19:46:18 2020 -0600
>
>     scsi: target: iscsi: Fix cmd abort fabric stop race
>
> With that patch, if the abort or a LUN reset has got to LIO core, then we
> are going to be stuck waiting for the data, which won't come because we
> killed the iSCSI threads.
>
> We can go back to always having the iSCSI target clean up the cmd, but if
> LIO has started to abort the cmd, we take an extra ref so we don't free
> the cmd from under each other.
>
> This patch is completely untested:
>
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index 2c54c5d8412d..d221e9be7468 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -4090,12 +4090,13 @@ static void iscsit_release_commands_from_conn(struct iscsi_conn *conn)
>  		spin_lock_irq(&se_cmd->t_state_lock);
>  		if (se_cmd->transport_state & CMD_T_ABORTED) {
>  			/*
> -			 * LIO's abort path owns the cleanup for this,
> -			 * so put it back on the list and let
> -			 * aborted_task handle it.
> +			 * The LIO TMR handler owns the cmd but if
> +			 * we were waiting for data from the initiator
> +			 * then we need to internally cleanup to be
> +			 * able to complete it. Get an extra ref so
> +			 * we don't free the cmd from under LIO core.
>  			 */
> -			list_move_tail(&cmd->i_conn_node,
> -				       &conn->conn_cmd_list);
> +			target_get_sess_cmd(se_cmd, false);
>  		} else {
>  			se_cmd->transport_state |= CMD_T_FABRIC_STOP;
>  		}
>

Another alternative would be to have iSCSI check whether it was waiting on
data before sending a TMR to LIO core. If it is, then it would just clean
up internally and complete the TMR with success/failed depending on the
TMR, or drop it and let the initiator escalate.
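
For what it's worth, a minimal sketch of the check that alternative implies
is below. It is untested and purely illustrative: iscsit_cmd_waiting_for_dataout()
is a made-up name, the snippet assumes the usual drivers/target/iscsi context,
and whether taking t_state_lock alone is enough to close the race with
R2T/Data-Out processing would still need to be verified. The idea is that the
task management path (iscsi_handle_task_mgt_cmd in the call tree above) would
call it on the referenced cmd before queueing the TMR to LIO core:

  /*
   * Hypothetical helper: report whether the command referenced by the
   * ABORT TASK is still sitting in TRANSPORT_WRITE_PENDING, i.e. the
   * target is waiting for Data-Out PDUs that will never arrive once
   * the initiator has decided to abort.
   */
  static bool iscsit_cmd_waiting_for_dataout(struct iscsi_cmd *cmd)
  {
  	struct se_cmd *se_cmd = &cmd->se_cmd;
  	unsigned long flags;
  	bool write_pending;

  	spin_lock_irqsave(&se_cmd->t_state_lock, flags);
  	write_pending = (se_cmd->t_state == TRANSPORT_WRITE_PENDING);
  	spin_unlock_irqrestore(&se_cmd->t_state_lock, flags);

  	return write_pending;
  }

If that check returned true, iSCSI would stop the Data-Out timer, drop the
outstanding R2Ts and complete (or drop) the ABORT TASK locally instead of
handing the TMR to LIO core, which is the path that currently ends up stuck
in core_tmr_abort_task().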