deadlocks when the target server runs as initiator to itself

Maurizio Lombardi <mlombard@xxxxxxxxxx> · Fri, 27 Jan 2023 10:58:33 +0100

Hello Mike, Dmitry,

A customer of ours needs an unusual configuration where an iSCSI initiator
runs on the same host as the target;
in other words, the host sees an iSCSI disk that is in fact just a local disk.

The problem is that under heavy load the system sometimes hangs.
Here is an example backtrace:

    crash> bt 2037117
    PID: 2037117  TASK: ffff8bb4c901dac0  CPU: 0    COMMAND: "iscsi_trx"
     #0 [ffffa3f4199db378] __schedule at ffffffff9134b2ed
     #1 [ffffa3f4199db408] schedule at ffffffff9134b7c8
     #2 [ffffa3f4199db418] io_schedule at ffffffff9134bbe2
     #3 [ffffa3f4199db428] rq_qos_wait at ffffffff90e61245
     #4 [ffffa3f4199db4b0] wbt_wait at ffffffff90e7bb99
     #5 [ffffa3f4199db4f0] __rq_qos_throttle at ffffffff90e60fc3
     #6 [ffffa3f4199db508] blk_mq_make_request at ffffffff90e5159d
     #7 [ffffa3f4199db598] generic_make_request at ffffffff90e4592f
     #8 [ffffa3f4199db600] submit_bio at ffffffff90e45bcc
     #9 [ffffa3f4199db640] xlog_state_release_iclog at ffffffffc0358cae [xfs]
    #10 [ffffa3f4199db668] __xfs_log_force_lsn at ffffffffc0359059 [xfs]
    #11 [ffffa3f4199db6d8] xfs_log_force_lsn at ffffffffc035a21f [xfs]
    #12 [ffffa3f4199db710] __xfs_iunpin_wait at ffffffffc03454e6 [xfs]
    #13 [ffffa3f4199db780] xfs_reclaim_inode at ffffffffc033c203 [xfs]
    #14 [ffffa3f4199db7c8] xfs_reclaim_inodes_ag at ffffffffc033c620 [xfs]
    #15 [ffffa3f4199db948] xfs_reclaim_inodes_nr at ffffffffc033d851 [xfs]
    #16 [ffffa3f4199db960] super_cache_scan at ffffffff90d1cad2
    #17 [ffffa3f4199db9b0] do_shrink_slab at ffffffff90c73e9c
    #18 [ffffa3f4199dba20] shrink_slab at ffffffff90c74761
    #19 [ffffa3f4199dbaa0] shrink_node at ffffffff90c7908c
    #20 [ffffa3f4199dbb20] do_try_to_free_pages at ffffffff90c79659
    #21 [ffffa3f4199dbb70] try_to_free_pages at ffffffff90c79a5f
    #22 [ffffa3f4199dbc10] __alloc_pages_slowpath at ffffffff90cbcd31
    #23 [ffffa3f4199dbd08] __alloc_pages_nodemask at ffffffff90cbd953
    #24 [ffffa3f4199dbd68] sgl_alloc_order at ffffffff90e80e08
    #25 [ffffa3f4199dbdb8] transport_generic_new_cmd at ffffffffc0972ce5 [target_core_mod]
    #26 [ffffa3f4199dbdf8] iscsit_process_scsi_cmd at ffffffffc09eabf5 [iscsi_target_mod]
    #27 [ffffa3f4199dbe18] iscsit_get_rx_pdu at ffffffffc09ec239 [iscsi_target_mod]
    #28 [ffffa3f4199dbed8] iscsi_target_rx_thread at ffffffffc09eda61 [iscsi_target_mod]
    #29 [ffffa3f4199dbf10] kthread at ffffffff90b036c6

This is what I think may be happening:

The rx thread receives an iSCSI command and calls sgl_alloc(), but the
kernel needs to reclaim memory to satisfy the allocation. The memory
reclaim code starts a flush against the filesystem mounted on top of
the iSCSI device. This ends up in a deadlock: the filesystem needs the
target driver to complete the task, but the iscsi_rx thread is stuck
in sgl_alloc().

Does this sound correct to you?

What do you think about using memalloc_noio_*() in the iscsi_rx thread
to prevent the memory reclaim code from starting I/O operations? Any
alternative ideas?
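
To make the idea concrete, here is a rough userspace model of the scoped
save/restore pattern I have in mind. It is not kernel code and not a patch:
every model_* name and MODEL_* constant is a simplified stand-in that I made
up for illustration (they loosely mirror memalloc_noio_save(),
memalloc_noio_restore() and the PF_MEMALLOC_NOIO task flag), and
rx_handle_pdu() / rx_thread_iteration() are hypothetical placeholders for the
rx thread's work. The point is only to show that, while the scope is active,
an allocation requested with a GFP_KERNEL-like mask would no longer be
allowed to start I/O or re-enter the filesystem during reclaim.

```c
/* Userspace model only; the names and bit values below are illustrative
 * assumptions, not the kernel's real definitions. */
#define MODEL_GFP_RECLAIM 0x1u  /* reclaim itself is still allowed */
#define MODEL_GFP_IO      0x2u  /* reclaim may start block I/O */
#define MODEL_GFP_FS      0x4u  /* reclaim may call into the filesystem */
#define MODEL_GFP_KERNEL  (MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

#define MODEL_PF_NOIO     0x1u  /* stand-in for the per-task NOIO flag */

static unsigned int task_flags; /* stand-in for current->flags */

/* Enter a NOIO scope; return the previous state so scopes can nest. */
static unsigned int model_noio_save(void)
{
    unsigned int old = task_flags & MODEL_PF_NOIO;

    task_flags |= MODEL_PF_NOIO;
    return old;
}

/* Leave a NOIO scope, restoring whatever state the caller saved. */
static void model_noio_restore(unsigned int old)
{
    task_flags = (task_flags & ~MODEL_PF_NOIO) | old;
}

/* Compute the mask an allocation effectively uses inside the scope. */
static unsigned int model_gfp_context(unsigned int gfp)
{
    if (task_flags & MODEL_PF_NOIO)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/* Hypothetical stand-in for PDU handling that ends up allocating an SGL. */
static unsigned int rx_handle_pdu(void)
{
    return model_gfp_context(MODEL_GFP_KERNEL);
}

/* Hypothetical rx loop body: every allocation made here runs in NOIO scope. */
static unsigned int rx_thread_iteration(void)
{
    unsigned int noio = model_noio_save();
    unsigned int effective_gfp = rx_handle_pdu();

    model_noio_restore(noio);
    return effective_gfp;
}
```

In this model the call sites keep asking for a GFP_KERNEL-like mask and only
the thread's scope changes, which is why I would prefer this over changing
the gfp flags at each allocation site.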

Thanks!
Maurizio