Re: [PATCH v4 07/37] target: Introduce a new workqueue for TMF processing

"Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> · Thu, 09 Feb 2017 01:34:59 -0800

On Wed, 2017-02-08 at 14:24 -0800, Bart Van Assche wrote:
> Several SCSI transport protocols require that SCSI task management
> functions are processed in the order in which these have been
> submitted by the initiator system. Hence introduce a workqueue
> per session for TMF processing. Do not specify WQ_MEM_RECLAIM since
> only the tcm_loop driver can queue TMF work from inside the memory
> allocation path and since the tcm_loop driver is only used to debug
> the SCSI target code.

This change effects exiting code negatively in two different ways.

Comments below.

> 
> Signed-off-by: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>
> Cc: Hannes Reinecke <hare@xxxxxxxx>
> Cc: David Disseldorp <ddiss@xxxxxxx>
> ---
>  drivers/target/target_core_transport.c | 11 ++++++++++-
>  include/target/target_core_base.h      |  1 +
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
> index 1cadc9eefa21..084a7fbfc72e 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -241,6 +241,13 @@ struct se_session *transport_init_session(enum target_prot_op sup_prot_ops)
>  	spin_lock_init(&se_sess->sess_cmd_lock);
>  	se_sess->sup_prot_ops = sup_prot_ops;
>  
> +	se_sess->tmf_wq = alloc_workqueue("tmf-%p", WQ_UNBOUND, 1, se_sess);
> +	if (!se_sess->tmf_wq) {
> +		pr_err("%s: workqueue allocation failed\n", __func__);
> +		kfree(se_sess);
> +		return ERR_PTR(-ENOMEM);
> +	}
> +

First, alloc_workqueue() and destroy_workqueue use a global
wq_pool_mutex when updating the global workqueue list.

Which means session creation and session deletion for target-core
effectively has a global synchronization point across all endpoints and
all fabric drivers.

My use-case for scale-out with LIO requires to run at least 1000
unique /sys/kernel/config/target/$FABRIC/$WWN/$TPGT/ endpoints per node
that must be able to operate independently of one another, with a least
one active session per endpoint.

So from that perspective, this change to add alloc_workqueue() +
destroy_workqueue() in the session creation + deletion paths is a
non-starter for scale-out.

>  	return se_sess;
>  }
>  EXPORT_SYMBOL(transport_init_session);
> @@ -507,6 +514,8 @@ void transport_free_session(struct se_session *se_sess)
>  		se_sess->se_node_acl = NULL;
>  		target_put_nacl(se_nacl);
>  	}
> +	if (se_sess->tmf_wq)
> +		destroy_workqueue(se_sess->tmf_wq);
>  	if (se_sess->sess_cmd_map) {
>  		percpu_ida_destroy(&se_sess->sess_tag_pool);
>  		kvfree(se_sess->sess_cmd_map);
> @@ -3129,7 +3138,7 @@ int transport_generic_handle_tmr(
>  	spin_unlock_irqrestore(&cmd->t_state_lock, flags);
>  
>  	INIT_WORK(&cmd->work, target_tmr_work);
> -	queue_work(cmd->se_dev->tmr_wq, &cmd->work);
> +	queue_work(cmd->se_sess->tmf_wq, &cmd->work);
>  	return 0;
>  }

The concern for LUN_RESET is multiple kthreads draining per se_device
TMRs and se_cmds from se_device->state_list in target_tmr_work() can now
happen in parallel in multiple kthreads (for the same device) within the
per se_session workqueue.

That is a rather fundamental change to the way existing TMR code works,
so without an actual bug this change addresses, or use case where it's
necessary I'm dropping this patch for now. 

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html