> Subject: Re: [PATCH for-rc] IB/isert: Fix hang in iscsit_wait_for_tag
>
>
> On 1/30/23 20:22, Devale, Sindhu wrote:
> >
> >
> >> Subject: Re: [PATCH for-rc] IB/isert: Fix hang in iscsit_wait_for_tag
> >>
> >>
> >>> From: Mustafa Ismail <mustafa.ismail@xxxxxxxxx>
> >>>
> >>> Running fio can occasionally cause a hang when sbitmap_queue_get()
> >>> fails to return a tag in iscsit_allocate_cmd() and
> >>> iscsit_wait_for_tag() is called and will never return from the
> >>> schedule(). This is because the polling thread of the CQ is
> >>> suspended, and will not poll for a SQ completion which would free up a tag.
> >>> Fix this by creating a separate CQ for the SQ so that send
> >>> completions are processed on a separate thread and are not blocked
> >>> when the RQ CQ is stalled.
> >>>
> >>> Fixes: 10e9cbb6b531 ("scsi: target: Convert target drivers to use
> >>> sbitmap")
> >>
> >> Is this the real offending commit? What prevented this from happening
> >> before?
> >
> > Maybe going to a global bitmap instead of per-cpu ida makes it less likely
> > to occur.
> > Going to a single CQ may be the real root cause, in commit
> > 6f0fae3d7797 ("iser-target: Use single CQ for TX and RX").
>
> Yes this is more likely.
>
> >
> >>> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
> >>> Signed-off-by: Mustafa Ismail <mustafa.ismail@xxxxxxxxx>
> >>> Signed-off-by: Shiraz Saleem <shiraz.saleem@xxxxxxxxx>
> >>> ---
> >>>  drivers/infiniband/ulp/isert/ib_isert.c | 33 +++++++++++++++++++++++----------
> >>>  drivers/infiniband/ulp/isert/ib_isert.h |  3 ++-
> >>>  2 files changed, 25 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
> >>> index 7540488..f827b91 100644
> >>> --- a/drivers/infiniband/ulp/isert/ib_isert.c
> >>> +++ b/drivers/infiniband/ulp/isert/ib_isert.c
> >>> @@ -109,19 +109,27 @@ static int isert_sg_tablesize_set(const char *val, const struct kernel_param *kp
> >>>  	struct ib_qp_init_attr attr;
> >>>  	int ret, factor;
> >>>
> >>> -	isert_conn->cq = ib_cq_pool_get(ib_dev, cq_size, -1, IB_POLL_WORKQUEUE);
> >>> -	if (IS_ERR(isert_conn->cq)) {
> >>> -		isert_err("Unable to allocate cq\n");
> >>> -		ret = PTR_ERR(isert_conn->cq);
> >>> +	isert_conn->snd_cq = ib_cq_pool_get(ib_dev, cq_size, -1,
> >>> +					    IB_POLL_WORKQUEUE);
> >>> +	if (IS_ERR(isert_conn->snd_cq)) {
> >>> +		isert_err("Unable to allocate send cq\n");
> >>> +		ret = PTR_ERR(isert_conn->snd_cq);
> >>>  		return ERR_PTR(ret);
> >>>  	}
> >>> +	isert_conn->rcv_cq = ib_cq_pool_get(ib_dev, cq_size, -1,
> >>> +					    IB_POLL_WORKQUEUE);
> >>> +	if (IS_ERR(isert_conn->rcv_cq)) {
> >>> +		isert_err("Unable to allocate receive cq\n");
> >>> +		ret = PTR_ERR(isert_conn->rcv_cq);
> >>> +		goto create_cq_err;
> >>> +	}
> >>
> >> Does this have any noticeable performance implications?
> >
> > Initial testing seems to indicate this change causes significant performance
> > variability, but only with 2K writes.
> > We suspect that may be due to an unfortunate vector placement where the
> > snd_cq and rcv_cq are on different NUMA nodes.
> > We can, in the patch, alter the second CQ creation to pass a comp_vector to
> > ensure they are hinted to the same affinity.
> > Even so, there are now two competing threads for completion processing.
> >
> >> Also I wonder if there are any other assumptions in the code for
> >> having a single context processing completions...
> >
> > We don't see any.
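
For illustration, passing the same comp_vector hint to both CQ allocations (instead of -1) could look roughly like the sketch below. Everything except the comp_vector selection follows the posted patch; the hint choice here (current CPU modulo the device's completion vectors) is only an example, not what the patch does:

	/*
	 * Sketch only: hint both CQs to the same completion vector so the
	 * send and receive completion work shares affinity, rather than
	 * letting ib_cq_pool_get() pick a vector independently with -1.
	 */
	int comp_vector = raw_smp_processor_id() % ib_dev->num_comp_vectors;

	isert_conn->snd_cq = ib_cq_pool_get(ib_dev, cq_size, comp_vector,
					    IB_POLL_WORKQUEUE);
	if (IS_ERR(isert_conn->snd_cq)) {
		isert_err("Unable to allocate send cq\n");
		ret = PTR_ERR(isert_conn->snd_cq);
		return ERR_PTR(ret);
	}

	isert_conn->rcv_cq = ib_cq_pool_get(ib_dev, cq_size, comp_vector,
					    IB_POLL_WORKQUEUE);
	if (IS_ERR(isert_conn->rcv_cq)) {
		isert_err("Unable to allocate receive cq\n");
		ret = PTR_ERR(isert_conn->rcv_cq);
		goto create_cq_err;
	}
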
> >
> >> It'd be much easier if iscsi_allocate_cmd could accept a timeout to fail...
> >>
> >> CCing target-devel and Mike.
> >
> > Do you mean adding a timeout to the wait, or removing the call
> > to iscsit_wait_for_tag() in iscsit_allocate_cmd()?
>
> Looking at the code, passing it TASK_RUNNING will make it fail if there is no
> available tag (and hence drop the received command, let the initiator retry). But I
> also think that isert may need a deeper default queue depth...

Hi Sagi -

Mustafa reports -

"The problem is not easily reproduced, so I reduced the number of map_tags allocated while testing a potential fix.

Passing TASK_RUNNING, I got the following call trace:

[ 220.131709] isert: isert_allocate_cmd: Unable to allocate iscsit_cmd + isert_cmd
[ 220.131712] isert: isert_allocate_cmd: Unable to allocate iscsit_cmd + isert_cmd
[ 280.862544] ABORT_TASK: Found referenced iSCSI task_tag: 70
[ 313.265156] iSCSI Login timeout on Network Portal 5.1.1.21:3260
[ 334.769268] INFO: task kworker/32:3:1285 blocked for more than 30 seconds.
[ 334.769272] Tainted: G OE 6.2.0-rc3 #6
[ 334.769274] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 334.769275] task:kworker/32:3 state:D stack:0 pid:1285 ppid:2 flags:0x00004000
[ 334.769279] Workqueue: events target_tmr_work [target_core_mod]
[ 334.769307] Call Trace:
[ 334.769308]  <TASK>
[ 334.769310]  __schedule+0x318/0xa30
[ 334.769316]  ? _prb_read_valid+0x22e/0x2b0
[ 334.769319]  ? __pfx_schedule_timeout+0x10/0x10
[ 334.769322]  ? __wait_for_common+0xd3/0x1e0
[ 334.769323]  schedule+0x57/0xd0
[ 334.769325]  schedule_timeout+0x273/0x320
[ 334.769327]  ? __irq_work_queue_local+0x39/0x80
[ 334.769330]  ? irq_work_queue+0x3f/0x60
[ 334.769332]  ? __pfx_schedule_timeout+0x10/0x10
[ 334.769333]  __wait_for_common+0xf9/0x1e0
[ 334.769335]  target_put_cmd_and_wait+0x59/0x80 [target_core_mod]
[ 334.769351]  core_tmr_abort_task.cold.8+0x187/0x202 [target_core_mod]
[ 334.769369]  target_tmr_work+0xa1/0x110 [target_core_mod]
[ 334.769384]  process_one_work+0x1b0/0x390
[ 334.769387]  worker_thread+0x40/0x380
[ 334.769389]  ? __pfx_worker_thread+0x10/0x10
[ 334.769391]  kthread+0xfa/0x120
[ 334.769393]  ? __pfx_kthread+0x10/0x10
[ 334.769395]  ret_from_fork+0x29/0x50
[ 334.769399]  </TASK>
[ 334.769442] INFO: task iscsi_np:5337 blocked for more than 30 seconds.
[ 334.769444] Tainted: G OE 6.2.0-rc3 #6
[ 334.769444] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 334.769445] task:iscsi_np state:D stack:0 pid:5337 ppid:2 flags:0x00004004
[ 334.769447] Call Trace:
[ 334.769447]  <TASK>
[ 334.769448]  __schedule+0x318/0xa30
[ 334.769451]  ? __pfx_schedule_timeout+0x10/0x10
[ 334.769453]  ? __wait_for_common+0xd3/0x1e0
[ 334.769454]  schedule+0x57/0xd0
[ 334.769456]  schedule_timeout+0x273/0x320
[ 334.769459]  ? iscsi_update_param_value+0x27/0x70 [iscsi_target_mod]
[ 334.769476]  ? __kmalloc_node_track_caller+0x52/0x130
[ 334.769478]  ? __pfx_schedule_timeout+0x10/0x10
[ 334.769480]  __wait_for_common+0xf9/0x1e0
[ 334.769481]  iscsi_check_for_session_reinstatement+0x1e8/0x280 [iscsi_target_mod]
[ 334.769496]  iscsi_target_do_login+0x23b/0x570 [iscsi_target_mod]
[ 334.769508]  iscsi_target_start_negotiation+0x55/0xc0 [iscsi_target_mod]
[ 334.769519]  iscsi_target_login_thread+0x675/0xeb0 [iscsi_target_mod]
[ 334.769531]  ? __pfx_iscsi_target_login_thread+0x10/0x10 [iscsi_target_mod]
[ 334.769541]  kthread+0xfa/0x120
[ 334.769543]  ? __pfx_kthread+0x10/0x10
[ 334.769544]  ret_from_fork+0x29/0x50
[ 334.769547]  </TASK>

[ 185.734571] isert: isert_allocate_cmd: Unable to allocate iscsit_cmd + isert_cmd
[ 246.032360] ABORT_TASK: Found referenced iSCSI task_tag: 75
[ 278.442726] iSCSI Login timeout on Network Portal 5.1.1.21:3260

By the way, increasing tag_num in iscsi_target_locate_portal() will also avoid the issue"

Any thoughts on what could be causing this hang?

Shiraz
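
For reference, the TASK_RUNNING experiment described above maps to roughly this change in isert_allocate_cmd() (sketch only; the current mainline code passes TASK_INTERRUPTIBLE, and the comments here are explanatory, not part of any posted patch):

	/* In isert_allocate_cmd(), use TASK_RUNNING instead of TASK_INTERRUPTIBLE: */
	cmd = iscsit_allocate_cmd(conn, TASK_RUNNING);
	if (!cmd) {
		/*
		 * With TASK_RUNNING, iscsit_allocate_cmd() returns NULL when no
		 * tag is available instead of sleeping in iscsit_wait_for_tag(),
		 * so the received command is dropped and the initiator retries.
		 */
		isert_err("Unable to allocate iscsit_cmd + isert_cmd\n");
		return NULL;
	}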