Re: [PATCH v2] target: core: remove from tmr_list at lun unlink

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/16/21 4:21 AM, Dmitry Bogdanov wrote:
> Currently TMF commands are removed from de_device.dev_tmf_list at
> the very end of se_cmd lifecycle. But se_lun unlinks from se_cmd
> up on a command status (response) is queued in transport layer.
> It means that LUN and backend device can be deleted meantime and at
> the moment of repsonse completion a panic is occured:
> 
> target_tmr_work()
> 	cmd->se_tfo->queue_tm_rsp(cmd); // send abort_rsp to a wire
> 	transport_lun_remove_cmd(cmd) // unlink se_cmd from se_lun
> - // - // - // -
> <<<--- lun remove
> <<<--- core backend device remove
> - // - // - // -
> qlt_handle_abts_completion()
>   tfo->free_mcmd()
>     transport_generic_free_cmd()
>       target_put_sess_cmd()
>         core_tmr_release_req() {
>           if (dev) { // backend device, can not be null
>             spin_lock_irqsave(&dev->se_tmr_lock, flags); //<<<--- CRASH
> 
> Call Trace:
> NIP [c000000000e1683c] _raw_spin_lock_irqsave+0x2c/0xc0
> LR [c00800000e433338] core_tmr_release_req+0x40/0xa0 [target_core_mod]
> Call Trace:
> (unreliable)
> 0x0
> target_put_sess_cmd+0x2a0/0x370 [target_core_mod]
> transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
> tcm_qla2xxx_complete_mcmd+0x28/0x50 [tcm_qla2xxx]
> process_one_work+0x2c4/0x5c0
> worker_thread+0x88/0x690
> 
> For FC protocol it is a race condition, but for iSCSI protocol it is
> easyly reproduced by manual sending iSCSI commands:
> - Send some SCSI sommand
> - Send Abort of that command over iSCSI
> - Remove LUN on target
> - Send next iSCSI command to acknowledge the Abort_Response
> - target panics
> 
> There is no sense to keep the command in tmr_list until response
> completion, so move the removal from tmr_list from the response
> completion to the response queueing when lun is unlinked.
> Move the removal from state list too as it is a subject to the same
> race condition.
> 
> Fixes: c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6")
> Reviewed-by: Roman Bolshakov <r.bolshakov@xxxxxxxxx>
> Signed-off-by: Dmitry Bogdanov <d.bogdanov@xxxxxxxxx>
> 
> ---
> v2:
>  fix cmd stuck in tmr list in error case in iscsi
>  move clearing cmd->se_lun to transport_lun_remove_cmd
> 
> The issue exists from the very begining.
> I uploaded a scapy script that helps to reproduce the issue at
> https://urldefense.com/v3/__https://gist.github.com/logost/cb93df41dd2432454324449b390403c4__;!!GqivPVa7Brio!MjANCRIp5ZrtKYonxEKclROOwOe7s-adKHLiUd2Njis6m1774RMpLGNkHyapFf68BwFr$ 
> ---
>  drivers/target/iscsi/iscsi_target.c    |  2 +-
>  drivers/target/target_core_tmr.c       |  9 --------
>  drivers/target/target_core_transport.c | 29 +++++++++++++++++++-------
>  3 files changed, 23 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index d0e7ed8f28cc..3311b031a812 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -2142,7 +2142,7 @@ iscsit_handle_task_mgt_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
>  	 * TMR TASK_REASSIGN.
>  	 */
>  	iscsit_add_cmd_to_response_queue(cmd, conn, cmd->i_state);
> -	target_put_sess_cmd(&cmd->se_cmd);
> +	transport_generic_free_cmd(&cmd->se_cmd, false);
>  	return 0;
>  }

Doh. I see how I got all confused. I guess this path leaks the lun_ref
taken by transport_lookup_tmr_lun. It looks like an old issue and nothing
to do with your patch.

I'm not sure if we are supposed to be calling transport_generic_free_cmd
twice. It looks like it works ok, because your patch added the "cmd->se_lun = NULL"
in transport_lun_remove_cmd, so we won't do a double list deletion.
It feels dirty though. I can feel Bart saying, "Mike did you see the comment
at the top of the function". :)

Maybe it's best to more cleanly unwind what was setup in the rror path. I think we
can fix lun_ref leak too.

So instead of doing transport_generic_free_cmd above do transport_lun_remove_cmd
to match/undo the transport_lookup_tmr_lun call in iscsit_handle_task_mgt_cmd?



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux