Nicholas,

Answer is below.

Regards,
Quinn Tran

On 5/23/14 7:33 PM, "Nicholas A. Bellinger" <nab@xxxxxxxxxxxxxxx> wrote:

>Hi Qlogic folks,
>
>A question for you below..
>
>On Sat, 2014-05-24 at 00:43 +0000, Nicholas A. Bellinger wrote:
>> From: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
>>
>> This patch converts qla2xxx target code to use generic percpu_ida
>> tag allocation provided by target-core, thus removing the original
>> kmem_cache_zalloc() for each struct qla_tgt_cmd descriptor in the
>> incoming ATIO packet fast-path.
>>
>> This includes the conversion of qlt_handle_cmd_for_atio() to perform
>> qla_tgt_sess lookup before dispatching a command descriptor into
>> qla_tgt_wq process context, along with handling the case where no
>> active session exists, and subsequently kicking off a separate
>> process context for qlt_create_sess_from_atio() to create a new one.
>>
>> It also includes moving tag allocation into generic code within
>> qlt_get_tag(), so that the same logic can be shared between
>> qlt_handle_cmd_for_atio() + qlt_create_sess_from_atio() contexts.
>> Also, __qlt_do_work() has been made generic between both normal
>> process context in qlt_do_work() + qlt_create_sess_from_atio().
>>
>> Next, update qlt_free_cmd() to release the percpu-ida tags, and
>> drop the now-unused global qla_tgt_cmd_cachep.
>>
>> Finally in tcm_qla2xxx code, tcm_qla2xxx_check_initiator_node_acl()
>> has been updated to use transport_init_session_tags() along with a
>> hardcoded TCM_QLA2XXX_DEFAULT_TAGS=512 as the number of qla_tgt_cmd
>> descriptors to pre-allocate per qla_tgt_sess instance.
>>
>> Cc: Saurav Kashyap <saurav.kashyap@xxxxxxxxxx>
>> Cc: Quinn Tran <quinn.tran@xxxxxxxxxx>
>> Cc: Giridhar Malavali <giridhar.malavali@xxxxxxxxxx>
>> Cc: Chad Dupuis <chad.dupuis@xxxxxxxxxx>
>> Cc: Roland Dreier <roland@xxxxxxxxxx>
>> Cc: Christoph Hellwig <hch@xxxxxx>
>> Signed-off-by: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
>> ---
>>  drivers/scsi/qla2xxx/qla_target.c  | 195 ++++++++++++++++++++++++------------
>>  drivers/scsi/qla2xxx/qla_target.h  |   6 ++
>>  drivers/scsi/qla2xxx/tcm_qla2xxx.c |   4 +-
>>  drivers/scsi/qla2xxx/tcm_qla2xxx.h |   2 +
>>  4 files changed, 142 insertions(+), 65 deletions(-)
>>
>
><SNIP>
>
>> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.c b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
>> index 68fb66f..34db344 100644
>> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.c
>> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.c
>> @@ -1482,7 +1482,9 @@ static int tcm_qla2xxx_check_initiator_node_acl(
>>      }
>>      se_tpg = &tpg->se_tpg;
>>
>> -    se_sess = transport_init_session(TARGET_PROT_NORMAL);
>> +    se_sess = transport_init_session_tags(TCM_QLA2XXX_DEFAULT_TAGS,
>> +                    sizeof(struct qla_tgt_cmd),
>> +                    TARGET_PROT_NORMAL);
>>      if (IS_ERR(se_sess)) {
>>          pr_err("Unable to initialize struct se_session\n");
>>          return PTR_ERR(se_sess);
>> diff --git a/drivers/scsi/qla2xxx/tcm_qla2xxx.h b/drivers/scsi/qla2xxx/tcm_qla2xxx.h
>> index 33aaac8..b0a3ea5 100644
>> --- a/drivers/scsi/qla2xxx/tcm_qla2xxx.h
>> +++ b/drivers/scsi/qla2xxx/tcm_qla2xxx.h
>> @@ -4,6 +4,8 @@
>>  #define TCM_QLA2XXX_VERSION "v0.1"
>>  /* length of ASCII WWPNs including pad */
>>  #define TCM_QLA2XXX_NAMELEN 32
>> +/* Number of pre-allocated per-session tags */
>> +#define TCM_QLA2XXX_DEFAULT_TAGS 512
>>
>
>So a question wrt to the TCM_QLA2XXX_DEFAULT_TAGS value above used to
>determine the number of qla_tgt_cmd descriptors to pre-allocate at
>qla_tgt_sess creation time..
>
>This value needs to line up with the total number of possible incoming
>ATIO packets, plus the number of qla_tgt_cmd descriptors that have not
>been acknowledged with a CTIO packet. Typically with other fabrics that
>have been converted to use percpu_ida, we over-allocate the number of
>per-session tags in order to completely avoid the slow-path where
>percpu_ida has to steal tags from another CPU.
>
>AFAICT, there is no way at the qla_target driver level to enforce
>per-session queue_depth + reflect the depth at the initiator port, so
>the number of tags needs to be the worst case number of HW descriptors
>that a single session can consume + number of unacknowledged
>descriptors.

QT> The QLA driver & FW currently do not have queue_depth control at a
per-session level. Instead, the descriptor pool (i.e. the Exchange pool)
is managed by FW at the port level.

>
>So the question is, is there already a define somewhere in qla2xxx code
>that TCM_QLA2XXX_DEFAULT_TAGS can use as a starting point..? If not,
>what is the total number of outstanding commands that a single session
>(or single port..?) can expect to handle at a given time..?

QT> Typically, the worst-case value we see is 2048 for each port. It is
not currently a #define, and this value can change over time. A good
starting default value for TCM_QLA2XXX_DEFAULT_TAGS would be
2048 + 2% pad = 2088.

As an extra side note, the true value should be taken from
"ha->fw_xcb_count" if this field is set. Otherwise, fall back to
TCM_QLA2XXX_DEFAULT_TAGS.

>
>--nab
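
The sizing rule in the QT> reply can be sketched as below. This is only a user-space illustration, not driver code: struct hw_data_sketch is a hypothetical stand-in for the single field of interest in struct qla_hw_data, the uint16_t width of fw_xcb_count is an assumption, QLA_WORST_CASE_XCHG and session_tag_count() are made-up names, and TCM_QLA2XXX_DEFAULT_TAGS is shown with the 2088 value suggested in the reply rather than the 512 in the patch.

```c
#include <stdint.h>

/* Worst-case per-port exchange count quoted in the reply (assumed name). */
#define QLA_WORST_CASE_XCHG 2048u

/* 2% pad on top of the worst case: 2048 + 40 = 2088. */
#define TCM_QLA2XXX_DEFAULT_TAGS \
	(QLA_WORST_CASE_XCHG + (QLA_WORST_CASE_XCHG * 2u) / 100u)

/* Hypothetical stand-in for struct qla_hw_data; only one field is modeled. */
struct hw_data_sketch {
	uint16_t fw_xcb_count;	/* 0 is treated as "not set by firmware" */
};

/*
 * Pick the number of tags to pre-allocate for a new session:
 * prefer the firmware-reported exchange count, else use the default.
 */
static unsigned int session_tag_count(const struct hw_data_sketch *ha)
{
	if (ha && ha->fw_xcb_count)
		return ha->fw_xcb_count;

	return TCM_QLA2XXX_DEFAULT_TAGS;
}
```

In a driver, the returned count would presumably be passed as the first argument of transport_init_session_tags() in place of the hardcoded constant, provided the hardware data is reachable at session creation time.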