On 6/1/2020 4:46 PM, Krishnamraju Eraparaju wrote:
Hi, I observe, iscsit_allocate_cmd() process waiting indefinitely for 'iscsit tag' to become available. Here is the scenario: - at initiator, run "iozone -i 0 -I -+d -s 100000 -r 16384 -w" in an infinite loop. - after few seconds, target fails to obtain 'tag' from sbitmap_queue_get(), while processing ISCSI_CTRL PDU via isert_recv_done(). - then iscsit waits there for tag to become available, by calling schedule() in iscsit_wait_for_tag() - and the process never scheduled back again, not sure why.
can you increase the debug level on the target ? and share the logs in initiator side as well ? Seems that there is some reconnection that caused by io timeout. Also, is the drain_qp function of the iw_cxgb4 finished successfully ?
Due to this blockage the whole completion logic at iSER target went to grinding halt, causing NOPOUT request timeouts at initator, and below hung traces at target. Is this a known issue? Target: [May 7 20:30] iSCSI Login timeout on Network Portal 102.1.1.6:3260 [May 7 20:33] kworker/dying (2185) used greatest stack depth: 11672 bytes left [ +2.640115] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak) [ +26.031050] INFO: task iscsi_np:8387 blocked for more than 122 seconds. [ +0.000004] Not tainted 5.7.0-rc2+ #2 [ +0.000001] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ +0.000001] iscsi_np D14304 8387 2 0x80004084 [ +0.000005] Call Trace: [ +0.000008] ? __schedule+0x274/0x5e0 [ +0.000003] ? stack_trace_save+0x46/0x70 [ +0.000002] schedule+0x45/0xb0 [ +0.000002] schedule_timeout+0x1d6/0x290 [ +0.000001] wait_for_completion+0x82/0xe0 [ +0.000007] iscsi_check_for_session_reinstatement+0x205/0x260 [iscsi_target_mod] [ +0.000004] iscsi_target_do_login+0xe3/0x5c0 [iscsi_target_mod] [ +0.000004] iscsi_target_start_negotiation+0x4c/0xa0 [iscsi_target_mod] [ +0.000004] iscsi_target_login_thread+0x86f/0x1000 [iscsi_target_mod] [ +0.000003] kthread+0xf3/0x130 [ +0.000004] ? iscsi_target_login_sess_out+0x1f0/0x1f0 [iscsi_target_mod] [ +0.000001] ? kthread_bind+0x10/0x10 [ +0.000002] ret_from_fork+0x35/0x40 [ +0.000003] INFO: task iscsi_ttx:8863 blocked for more than 122 seconds. [ +0.000001] Not tainted 5.7.0-rc2+ #2 [ +0.000000] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ +0.000001] iscsi_ttx D14384 8863 2 0x80004084 [ +0.000003] Call Trace: [ +0.000002] ? __schedule+0x274/0x5e0 [ +0.000004] ? c4iw_ib_modify_qp+0xf3/0x160 [iw_cxgb4] [ +0.000002] schedule+0x45/0xb0 [ +0.000002] schedule_timeout+0x1d6/0x290 [ +0.000001] wait_for_completion+0x82/0xe0 [ +0.000003] __ib_drain_sq+0x147/0x170 [ +0.000002] ? ib_sg_to_pages+0x1a0/0x1a0 [ +0.000003] ib_drain_sq+0x6e/0x80 [ +0.000002] ib_drain_qp+0x9/0x20 [ +0.000002] isert_wait_conn+0x51/0x2c0 [ib_isert] [ +0.000004] iscsit_close_connection+0x14c/0x840 [iscsi_target_mod] [ +0.000002] ? __schedule+0x27c/0x5e0 [ +0.000004] iscsit_take_action_for_connection_exit+0x79/0x100 [iscsi_target_mod] [ +0.000003] iscsi_target_tx_thread+0x160/0x200 [iscsi_target_mod] [ +0.000004] ? wait_woken+0x80/0x80 [ +0.000002] kthread+0xf3/0x130 [ +0.000003] ? iscsit_thread_get_cpumask+0x80/0x80 [iscsi_target_mod] [ +0.000001] ? kthread_bind+0x10/0x10 [ +0.000001] ret_from_fork+0x35/0x40 [ +0.000003] NMI backtrace for cpu 7 [ +0.000002] CPU: 7 PID: 493 Comm: khungtaskd Not tainted 5.7.0-rc2+ #2 [ +0.000001] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1a 10/24/2018 [ +0.000001] Call Trace: [ +0.000003] dump_stack+0x50/0x6b [ +0.000002] nmi_cpu_backtrace+0x9e/0xb0 [ +0.000003] ? lapic_can_unplug_cpu+0x90/0x90 [ +0.000002] nmi_trigger_cpumask_backtrace+0x9c/0xf0 [ +0.000003] watchdog+0x310/0x4f0 [ +0.000002] kthread+0xf3/0x130 [ +0.000002] ? hungtask_pm_notify+0x40/0x40 [ +0.000001] ? kthread_bind+0x10/0x10 [ +0.000001] ret_from_fork+0x35/0x40 [ +0.000002] Sending NMI from CPU 7 to CPUs 0-6: [ +0.000009] NMI backtrace for cpu 0 skipped: idling at acpi_processor_ffh_cstate_enter+0x6f/0xc0 [ +0.000001] NMI backtrace for cpu 4 skipped: idling at acpi_processor_ffh_cstate_enter+0x6f/0xc0 [ +0.000002] NMI backtrace for cpu 1 skipped: idling at acpi_processor_ffh_cstate_enter+0x6f/0xc0 [ +0.000019] NMI backtrace for cpu 2 skipped: idling at acpi_processor_ffh_cstate_enter+0x6f/0xc0 [ +0.000001] NMI backtrace for cpu 6 skipped: idling at acpi_processor_ffh_cstate_enter+0x6f/0xc0 [ +0.000010] NMI backtrace for cpu 5 [ +0.000000] CPU: 5 PID: 4465 Comm: kworker/5:1H Not tainted 5.7.0-rc2+ #2 [ +0.000000] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.1a 10/24/2018 [ +0.000001] Workqueue: ib-comp-wq ib_cq_poll_work [ +0.000000] RIP: 0010:__free_pages+0xe/0x60 Initiator: [ 2002.689250] connection5:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4296659424, last ping 4296664448, now 4296669696 [ 2002.689275] connection5:0: detected conn error (1022) Thanks, Krishna.