Re: deadlocks when the target server runs as initiator to itself

Mike Christie <michael.christie@xxxxxxxxxx> · Sun, 29 Jan 2023 18:09:07 -0600

On 1/27/23 03:58, Maurizio Lombardi wrote:
> Hello Mike, Dmitry,
> 
> A customer of ours needs an unusual configuration where an iSCSI initiator
> runs on the same host as the target;
> in other words, the host sees an iSCSI disk that is in fact just a local disk.
> 
> The problem is that under heavy load the system sometimes hangs.
> Example backtrace:
> 
>     crash> bt 2037117
>     PID: 2037117  TASK: ffff8bb4c901dac0  CPU: 0    COMMAND: "iscsi_trx"
>      #0 [ffffa3f4199db378] __schedule at ffffffff9134b2ed
>      #1 [ffffa3f4199db408] schedule at ffffffff9134b7c8
>      #2 [ffffa3f4199db418] io_schedule at ffffffff9134bbe2
>      #3 [ffffa3f4199db428] rq_qos_wait at ffffffff90e61245
>      #4 [ffffa3f4199db4b0] wbt_wait at ffffffff90e7bb99
>      #5 [ffffa3f4199db4f0] __rq_qos_throttle at ffffffff90e60fc3
>      #6 [ffffa3f4199db508] blk_mq_make_request at ffffffff90e5159d
>      #7 [ffffa3f4199db598] generic_make_request at ffffffff90e4592f
>      #8 [ffffa3f4199db600] submit_bio at ffffffff90e45bcc
>      #9 [ffffa3f4199db640] xlog_state_release_iclog at ffffffffc0358cae [xfs]
>     #10 [ffffa3f4199db668] __xfs_log_force_lsn at ffffffffc0359059 [xfs]
>     #11 [ffffa3f4199db6d8] xfs_log_force_lsn at ffffffffc035a21f [xfs]
>     #12 [ffffa3f4199db710] __xfs_iunpin_wait at ffffffffc03454e6 [xfs]
>     #13 [ffffa3f4199db780] xfs_reclaim_inode at ffffffffc033c203 [xfs]
>     #14 [ffffa3f4199db7c8] xfs_reclaim_inodes_ag at ffffffffc033c620 [xfs]
>     #15 [ffffa3f4199db948] xfs_reclaim_inodes_nr at ffffffffc033d851 [xfs]
>     #16 [ffffa3f4199db960] super_cache_scan at ffffffff90d1cad2
>     #17 [ffffa3f4199db9b0] do_shrink_slab at ffffffff90c73e9c
>     #18 [ffffa3f4199dba20] shrink_slab at ffffffff90c74761
>     #19 [ffffa3f4199dbaa0] shrink_node at ffffffff90c7908c
>     #20 [ffffa3f4199dbb20] do_try_to_free_pages at ffffffff90c79659
>     #21 [ffffa3f4199dbb70] try_to_free_pages at ffffffff90c79a5f
>     #22 [ffffa3f4199dbc10] __alloc_pages_slowpath at ffffffff90cbcd31
>     #23 [ffffa3f4199dbd08] __alloc_pages_nodemask at ffffffff90cbd953
>     #24 [ffffa3f4199dbd68] sgl_alloc_order at ffffffff90e80e08
>     #25 [ffffa3f4199dbdb8] transport_generic_new_cmd at ffffffffc0972ce5 [target_core_mod]
>     #26 [ffffa3f4199dbdf8] iscsit_process_scsi_cmd at ffffffffc09eabf5 [iscsi_target_mod]
>     #27 [ffffa3f4199dbe18] iscsit_get_rx_pdu at ffffffffc09ec239 [iscsi_target_mod]
>     #28 [ffffa3f4199dbed8] iscsi_target_rx_thread at ffffffffc09eda61 [iscsi_target_mod]
>     #29 [ffffa3f4199dbf10] kthread at ffffffff90b036c6
> 
> This is what I think may happen:
> 
> The rx thread receives an iSCSI command and calls sgl_alloc(), but the
> kernel needs to reclaim memory to satisfy the allocation. The memory
> reclaim code starts a flush against the filesystem mounted on top of
> the iSCSI device. This ends up in a deadlock because the filesystem
> needs the target driver to complete the task, but the iscsi_rx thread
> is stuck in sgl_alloc().
> 
> Does this sound correct to you?

Yeah, I think nbd and rbd have similar issues when both ends run on the
same host. I think their answer is just "don't do that".

> 
> What do you think about using memalloc_noio_*() in the iscsi_rx thread
> to prevent the memory reclaim code from starting I/O operations? Any
> alternative ideas?

I don't think that's the best option because this is a rare use case and
the change would affect all other users. Why can't the user just use
tcm_loop for the local use case?
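
For readers unfamiliar with the memalloc_noio_*() proposal quoted above,
here is a minimal userspace model of how a scoped NOIO section behaves. It
is a sketch, not kernel code and not a proposed patch. The real helpers
are memalloc_noio_save() and memalloc_noio_restore() from
<linux/sched/mm.h>, which set and restore PF_MEMALLOC_NOIO in
current->flags. Every model_* name below is hypothetical, and reclaim is
reduced to a single "may start I/O" check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PF_MEMALLOC_NOIO task flag. */
#define MODEL_PF_MEMALLOC_NOIO 0x1u

/* Hypothetical stand-in for current->flags (one task only in this model). */
static unsigned int model_task_flags;

/* Mirrors memalloc_noio_save(): set the flag, return its previous state. */
unsigned int model_noio_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC_NOIO;

	model_task_flags |= MODEL_PF_MEMALLOC_NOIO;
	return old;
}

/* Mirrors memalloc_noio_restore(): put the flag back to the saved state. */
void model_noio_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/* Simplified reclaim policy: I/O may be started only outside a NOIO scope. */
bool model_reclaim_may_start_io(void)
{
	return !(model_task_flags & MODEL_PF_MEMALLOC_NOIO);
}

/*
 * Models the rx path allocating command buffers. Returns true if reclaim
 * triggered by that allocation would be allowed to start I/O, which is
 * the step that leads back into the target in the backtrace above.
 */
bool model_rx_alloc_may_start_io(bool use_noio_scope)
{
	unsigned int old = 0;
	bool may_io;

	if (use_noio_scope)
		old = model_noio_save();

	may_io = model_reclaim_may_start_io();	/* allocation happens here */

	if (use_noio_scope)
		model_noio_restore(old);

	return may_io;
}

/* Small usage example for the model. */
void model_selfcheck(void)
{
	assert(model_rx_alloc_may_start_io(false));	/* today: reclaim may do I/O */
	assert(!model_rx_alloc_may_start_io(true));	/* scoped: reclaim may not */
	assert(model_reclaim_may_start_io());		/* flag restored afterwards */
}
```

Because the flag is per task and stays set for the whole scope, wrapping
the rx thread this way would constrain every allocation that thread makes,
for every target user, which is the cost raised in the reply above.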