Re: [PATCH] target: core: remove from tmr_list at lun unlink

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/7/21 9:21 AM, Dmitry Bogdanov wrote:
> Currently TMF commands are removed from de_device.dev_tmf_list at
> the very end of se_cmd lifecycle. But se_lun unlinks from se_cmd
> up on a command status (response) is queued in transport layer.
> It means that LUN and backend device can be deleted meantime and at
> the moment of repsonse completion a panic is occured:
> 
> target_tmr_work()
> 	cmd->se_tfo->queue_tm_rsp(cmd); // send abort_rsp to a wire
> 	transport_lun_remove_cmd(cmd) // unlink se_cmd from se_lun
> — // — // — // —
> <<<--- lun remove
> <<<--- core backend device remove
> — // — // — // —
> qlt_handle_abts_completion()
>   tfo->free_mcmd()
>     transport_generic_free_cmd()
>       target_put_sess_cmd()
>         core_tmr_release_req() {
>           if (dev) { // backend device, can not be null
>             spin_lock_irqsave(&dev->se_tmr_lock, flags); //<<<--- CRASH
> 
> Call Trace:
> NIP [c000000000e1683c] _raw_spin_lock_irqsave+0x2c/0xc0
> LR [c00800000e433338] core_tmr_release_req+0x40/0xa0 [target_core_mod]
> Call Trace:
> (unreliable)
> 0x0
> target_put_sess_cmd+0x2a0/0x370 [target_core_mod]
> transport_generic_free_cmd+0x6c/0x1b0 [target_core_mod]
> tcm_qla2xxx_complete_mcmd+0x28/0x50 [tcm_qla2xxx]
> process_one_work+0x2c4/0x5c0
> worker_thread+0x88/0x690
> 
> For FC protocol it is a race condition, but for iSCSI protocol it is
> easyly reproduced by manual sending iSCSI commands:
> - Send some SCSI sommand
> - Send Abort of that command over iSCSI
> - Remove LUN on target
> - Send next iSCSI command to acknowledge the Abort_Response
> - target panics
> 
> There is no sense to keep the command in tmr_list until response
> completion, so move the removal from tmr_list from the response
> completion to the response queueing when lun is unlinked.
> Move the removal from state list too as it is a subject to the same
> race condition.
> 
> Fixes: c66ac9db8d4a ("[SCSI] target: Add LIO target core v4.0.0-rc6")
> Reviewed-by: Roman Bolshakov <r.bolshakov@xxxxxxxxx>
> Signed-off-by: Dmitry Bogdanov <d.bogdanov@xxxxxxxxx>
> 
> ---
> The issue exists from the very begining.
> I uploaded a script that helps to reproduce the issue at
> https://urldefense.com/v3/__https://gist.github.com/logost/cb93df41dd2432454324449b390403c4__;!!GqivPVa7Brio!KN-N7Ult7Itn2AzY6LdP2vm0D81UIjpbCyOAH3uf2OoLGUykpGP3dJTQlLm9m71MjsxY$ 
> ---
>  drivers/target/target_core_tmr.c       |  9 ---------
>  drivers/target/target_core_transport.c | 19 +++++++++++++++++--
>  2 files changed, 17 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/target/target_core_tmr.c b/drivers/target/target_core_tmr.c
> index 7347285471fa..323a173010c1 100644
> --- a/drivers/target/target_core_tmr.c
> +++ b/drivers/target/target_core_tmr.c
> @@ -50,15 +50,6 @@ EXPORT_SYMBOL(core_tmr_alloc_req);
>  
>  void core_tmr_release_req(struct se_tmr_req *tmr)
>  {
> -	struct se_device *dev = tmr->tmr_dev;
> -	unsigned long flags;
> -
> -	if (dev) {
> -		spin_lock_irqsave(&dev->se_tmr_lock, flags);
> -		list_del_init(&tmr->tmr_list);
> -		spin_unlock_irqrestore(&dev->se_tmr_lock, flags);
> -	}
> -
>  	kfree(tmr);
>  }
>  
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 5ecb9f18a53d..4d9968f43cf1 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -667,6 +667,20 @@ static void target_remove_from_state_list(struct se_cmd *cmd)
>  	spin_unlock_irqrestore(&dev->queues[cmd->cpuid].lock, flags);
>  }
>  
> +static void target_remove_from_tmr_list(struct se_cmd *cmd)
> +{
> +	struct se_device *dev = NULL;
> +	unsigned long flags;
> +

This is just a nit. Maybe just do:

struct se_device *dev = NULL;
unsigned long flags;

if (!(cmd->se_cmd_flags & SCF_SCSI_TMR_CDB))
	return;

dev = cmd->se_tmr_req->tmr_dev;
spin_lock_irqsave(&dev->se_tmr_lock, flags);
list_del_init(&cmd->se_tmr_req->tmr_list);
spin_unlock_irqrestore(&dev->se_tmr_lock, flags);


It will look like target_remove_from_state_list. Below I was expecting
some sort of twist ending with the extra if in there. But we just wanted
to run the function for tmrs.


> +	if (cmd->se_cmd_flags & SCF_SCSI_TMR_CDB)
> +		dev = cmd->se_tmr_req->tmr_dev;
> +
> +	if (dev) {
> +		spin_lock_irqsave(&dev->se_tmr_lock, flags);
> +		list_del_init(&cmd->se_tmr_req->tmr_list);
> +		spin_unlock_irqrestore(&dev->se_tmr_lock, flags);
> +	}
> +}
>  /*
>   * This function is called by the target core after the target core has
>   * finished processing a SCSI command or SCSI TMF. Both the regular command
> @@ -678,8 +692,6 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
>  {
>  	unsigned long flags;
>  
> -	target_remove_from_state_list(cmd);
> -
>  	/*
>  	 * Clear struct se_cmd->se_lun before the handoff to FE.
>  	 */

Right below this line we clear se_cmd->selun. I think that should go in
transport_lun_remove_cmd since it's the function that does the put and now
with your changes we handle all the last refs there.


> @@ -719,6 +731,9 @@ static void transport_lun_remove_cmd(struct se_cmd *cmd)
>  	if (!lun)
>  		return;
>  
> +	target_remove_from_state_list(cmd);
> +	target_remove_from_tmr_list(cmd);
> +
>  	if (cmpxchg(&cmd->lun_ref_active, true, false))
>  		percpu_ref_put(&lun->lun_ref);
>  }
>



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux