On 3/7/23 5:47 AM, Sagi Grimberg wrote: >> [ 220.131709] isert: isert_allocate_cmd: Unable to allocate iscsit_cmd + isert_cmd >> [ 220.131712] isert: isert_allocate_cmd: Unable to allocate iscsit_cmd + isert_cmd >> [ 280.862544] ABORT_TASK: Found referenced iSCSI task_tag: 70 >> [ 313.265156] iSCSI Login timeout on Network Portal 5.1.1.21:3260 >> [ 334.769268] INFO: task kworker/32:3:1285 blocked for more than 30 seconds. >> [ 334.769272] Tainted: G OE 6.2.0-rc3 #6 >> [ 334.769274] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 334.769275] task:kworker/32:3 state:D stack:0 pid:1285 ppid:2 flags:0x00004000 >> [ 334.769279] Workqueue: events target_tmr_work [target_core_mod] >> [ 334.769307] Call Trace: >> [ 334.769308] <TASK> >> [ 334.769310] __schedule+0x318/0xa30 >> [ 334.769316] ? _prb_read_valid+0x22e/0x2b0 >> [ 334.769319] ? __pfx_schedule_timeout+0x10/0x10 >> [ 334.769322] ? __wait_for_common+0xd3/0x1e0 >> [ 334.769323] schedule+0x57/0xd0 >> [ 334.769325] schedule_timeout+0x273/0x320 >> [ 334.769327] ? __irq_work_queue_local+0x39/0x80 >> [ 334.769330] ? irq_work_queue+0x3f/0x60 >> [ 334.769332] ? __pfx_schedule_timeout+0x10/0x10 >> [ 334.769333] __wait_for_common+0xf9/0x1e0 >> [ 334.769335] target_put_cmd_and_wait+0x59/0x80 [target_core_mod] >> [ 334.769351] core_tmr_abort_task.cold.8+0x187/0x202 [target_core_mod] >> [ 334.769369] target_tmr_work+0xa1/0x110 [target_core_mod] >> [ 334.769384] process_one_work+0x1b0/0x390 >> [ 334.769387] worker_thread+0x40/0x380 >> [ 334.769389] ? __pfx_worker_thread+0x10/0x10 >> [ 334.769391] kthread+0xfa/0x120 >> [ 334.769393] ? __pfx_kthread+0x10/0x10 >> [ 334.769395] ret_from_fork+0x29/0x50 >> [ 334.769399] </TASK> >> [ 334.769442] INFO: task iscsi_np:5337 blocked for more than 30 seconds. >> [ 334.769444] Tainted: G OE 6.2.0-rc3 #6 >> [ 334.769444] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. >> [ 334.769445] task:iscsi_np state:D stack:0 pid:5337 ppid:2 flags:0x00004004 >> [ 334.769447] Call Trace: >> [ 334.769447] <TASK> >> [ 334.769448] __schedule+0x318/0xa30 >> [ 334.769451] ? __pfx_schedule_timeout+0x10/0x10 >> [ 334.769453] ? __wait_for_common+0xd3/0x1e0 >> [ 334.769454] schedule+0x57/0xd0 >> [ 334.769456] schedule_timeout+0x273/0x320 >> [ 334.769459] ? iscsi_update_param_value+0x27/0x70 [iscsi_target_mod] >> [ 334.769476] ? __kmalloc_node_track_caller+0x52/0x130 >> [ 334.769478] ? __pfx_schedule_timeout+0x10/0x10 >> [ 334.769480] __wait_for_common+0xf9/0x1e0 >> [ 334.769481] iscsi_check_for_session_reinstatement+0x1e8/0x280 [iscsi_target_mod] The hang here might be this issue: https://lore.kernel.org/linux-scsi/c1a395a3-74e2-c77f-c8e6-1cade30dfac6@xxxxxxxxxx/T/#mdb29702f7c345eb7e3631d58e3ac7fac26e15fee That version had some bugs, so I'm working on a new version. >> [ 334.769496] iscsi_target_do_login+0x23b/0x570 [iscsi_target_mod] >> [ 334.769508] iscsi_target_start_negotiation+0x55/0xc0 [iscsi_target_mod] >> [ 334.769519] iscsi_target_login_thread+0x675/0xeb0 [iscsi_target_mod] >> [ 334.769531] ? __pfx_iscsi_target_login_thread+0x10/0x10 [iscsi_target_mod] >> [ 334.769541] kthread+0xfa/0x120 >> [ 334.769543] ? __pfx_kthread+0x10/0x10 >> [ 334.769544] ret_from_fork+0x29/0x50 >> [ 334.769547] </TASK> >> >> >> [ 185.734571] isert: isert_allocate_cmd: Unable to allocate iscsit_cmd + isert_cmd >> [ 246.032360] ABORT_TASK: Found referenced iSCSI task_tag: 75 Or, if there is only one session, then LIO might be waiting for commands to complete before allowing a new login. Or, it could be a combo of both. >> [ 278.442726] iSCSI Login timeout on Network Portal 5.1.1.21:3260 >> >> >> By the way increasing tag_num in iscsi_target_locate_portal() will also avoid the issue" >> >> Any thoughts on what could be causing this hang? > > I know that Mike just did a set of fixes on the session teardown area... > Perhaps you should try with the patchset "target: TMF and recovery > fixes" applied?