On Wed, 2011-11-09 at 05:54 +0000, Nicholas A. Bellinger wrote: > From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx> > > This patch converts qla_target.c code to use a single local scope > qla_tgt_wq setup during qla_tgt_init(), and drops qla_tgt->qla_tgt_wq > usage. It also makes tcm_qla2xxx_handle_data() use tcm_qla2xxx_free_wq > for handling aborted command response tcm_qla2xxx_handle_data() > > Cc: Christoph Hellwig <hch@xxxxxx> > Cc: Roland Dreier <roland@xxxxxxxxxxxxxxx> > Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx> > --- > drivers/scsi/qla2xxx/qla_target.c | 17 +++++++++++------ > drivers/scsi/qla2xxx/qla_target.h | 1 - > drivers/target/tcm_qla2xxx/tcm_qla2xxx_fabric.c | 2 +- > 3 files changed, 12 insertions(+), 8 deletions(-) > Hi Christoph, What extra parameters do you recommend using for the alloc_workqueue() setup here..? While doing some initial fio large block performance tests with a 2x port 8 Gb/sec 25xx parts with this evening, I noticed the per HW port qla_tgt->qla_tgt_wq workqueues using alloc_workqueue("qla_tgt_wq, WQ_UNBOUND, 1); seem to have a slight performance edge over this patch to start using a single global qla_tgt_wq setup in qla_target.c with alloc_workqueue("qla_tgt_wq, 0, 0). It's on the order of a ~300 MB/sec difference between the two, with the per qla_tgt->qla_tgt_wq running very near what the backend is capable of at ~1500 MB/sec, and this patch to convert to a single qla_tgt_wq with the same tests are ~1200 MB/sec. For the slower case with this patch, a single kworker thread is running @ 100% CPU utilization with the wq defaults. Using per HW port context qla_tgt->qla_tgt_wq with WQ_UNBOUND, 1, multiple kworkers are running at ~40% utilization, and AFAICT seem to be doing a better job of distributing load across multiple qla_hw_data ports. Any thoughts on what might be limiting the single global qla_tgt_wq in this patch against per HW port qla_tgt_wq dispatch..? Thanks, --nab > diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c > index dcf58ab..857a8e2 100644 > --- a/drivers/scsi/qla2xxx/qla_target.c > +++ b/drivers/scsi/qla2xxx/qla_target.c > @@ -117,6 +117,7 @@ static int qla_tgt_unreg_sess(struct qla_tgt_sess *sess); > static struct kmem_cache *qla_tgt_cmd_cachep; > static struct kmem_cache *qla_tgt_mgmt_cmd_cachep; > static mempool_t *qla_tgt_mgmt_cmd_mempool; > +static struct workqueue_struct *qla_tgt_wq; > /* > * From qla2xxx/qla_iobc.c and used by various qla_target.c logic > */ > @@ -4880,11 +4881,6 @@ int qla_tgt_add_target(struct qla_hw_data *ha, struct scsi_qla_host *base_vha) > INIT_WORK(&tgt->srr_work, qla_tgt_handle_srr_work); > atomic_set(&tgt->tgt_global_resets_count, 0); > > - tgt->qla_tgt_wq = alloc_workqueue("qla_tgt_wq", WQ_UNBOUND, 0); > - if (!tgt->qla_tgt_wq) > - return -ENOMEM; > - > - printk("Setup tgt->qla_tgt_wq: %p for qla_hw_data: %p\n", tgt->qla_tgt_wq, ha); > ha->qla_tgt = tgt; > > if (IS_FWI2_CAPABLE(ha)) { > @@ -4927,7 +4923,6 @@ int qla_tgt_remove_target(struct qla_hw_data *ha, struct scsi_qla_host *vha) > "existing target", vha->vp_idx); > return 0; > } > - destroy_workqueue(ha->qla_tgt->qla_tgt_wq); > > ql_dbg(ql_dbg_tgt, vha, 0xe037, "Unregistering target for host %ld(%p)", > vha->host_no, ha); > @@ -5451,8 +5446,17 @@ int __init qla_tgt_init(void) > goto out_mgmt_cmd_cachep; > } > > + qla_tgt_wq = alloc_workqueue("qla_tgt_wq", 0, 0); > + if (!qla_tgt_wq) { > + pr_warn(KERN_ERR "alloc_workqueue for qla_tgt_wq failed\n"); > + ret = -ENOMEM; > + goto out_cmd_mempool; > + } > + > return 0; > > +out_cmd_mempool: > + mempool_destroy(qla_tgt_mgmt_cmd_mempool); > out_mgmt_cmd_cachep: > kmem_cache_destroy(qla_tgt_mgmt_cmd_cachep); > out: > @@ -5462,6 +5466,7 @@ out: > > void __exit qla_tgt_exit(void) > { > + destroy_workqueue(qla_tgt_wq); > mempool_destroy(qla_tgt_mgmt_cmd_mempool); > kmem_cache_destroy(qla_tgt_mgmt_cmd_cachep); > kmem_cache_destroy(qla_tgt_cmd_cachep); > diff --git a/drivers/scsi/qla2xxx/qla_target.h b/drivers/scsi/qla2xxx/qla_target.h > index 22d9bc0..85b8cca 100644 > --- a/drivers/scsi/qla2xxx/qla_target.h > +++ b/drivers/scsi/qla2xxx/qla_target.h > @@ -870,7 +870,6 @@ struct qla_tgt { > spinlock_t sess_work_lock; > struct list_head sess_works_list; > struct work_struct sess_work; > - struct workqueue_struct *qla_tgt_wq; > > imm_ntfy_from_isp_t link_reinit_iocb; > wait_queue_head_t waitQ; > diff --git a/drivers/target/tcm_qla2xxx/tcm_qla2xxx_fabric.c b/drivers/target/tcm_qla2xxx/tcm_qla2xxx_fabric.c > index c738a1c..49b3e69 100644 > --- a/drivers/target/tcm_qla2xxx/tcm_qla2xxx_fabric.c > +++ b/drivers/target/tcm_qla2xxx/tcm_qla2xxx_fabric.c > @@ -771,7 +771,7 @@ int tcm_qla2xxx_handle_data(struct qla_tgt_cmd *cmd) > > se_cmd->scsi_sense_reason = TCM_CHECK_CONDITION_ABORT_CMD; > INIT_WORK(&cmd->work, tcm_qla2xxx_do_rsp); > - queue_work(cmd->tgt->qla_tgt_wq, &cmd->work); > + queue_work(tcm_qla2xxx_free_wq, &cmd->work); > return 0; > } > /* -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html