Re: [PATCH] iscsi-target: fix hang in iscsit_access_np() when getting tpg->np_login_sem


> On Jul 29, 2020, at 8:03 AM, Hou Pu <houpu@xxxxxxxxxxxxx> wrote:
> 
> The iSCSI target login thread might get stuck with the following stack:
> 
> cat /proc/`pidof iscsi_np`/stack
> [<0>] down_interruptible+0x42/0x50
> [<0>] iscsit_access_np+0xe3/0x167
> [<0>] iscsi_target_locate_portal+0x695/0x8ac
> [<0>] __iscsi_target_login_thread+0x855/0xb82
> [<0>] iscsi_target_login_thread+0x2f/0x5a
> [<0>] kthread+0xfa/0x130
> [<0>] ret_from_fork+0x1f/0x30
> 
> This can be reproduced with the following steps:
> 1. Initiator A tries to log in to iqn1-tpg1 on port 3260. After the
>   PDU exchange in the login thread finishes, but before negotiation
>   completes, the network link goes down. This can happen in a
>   production environment. I emulated it by bringing the network
>   card down on the initiator node with ifconfig eth0 down.
>   (Now A can never finish this login, and tpg->np_login_sem is
>   held by it.)
> 2. Initiator B tries to log in to iqn2-tpg1 on port 3260. After the
>   PDU exchange in the login thread finishes, the target expects to
>   process the remaining login PDUs in workqueue context.
> 3. Initiator A' tries to log in to iqn1-tpg1 again on port 3260 from
>   a new socket. It waits for tpg->np_login_sem, with
>   np->np_login_timer armed to limit the wait to at most 15 seconds.
>   (Because the lock is held by A, and A never gets a chance to
>   release tpg->np_login_sem, A' should eventually time out.)
> 4. Before A' times out, initiator B fails negotiation and
>   calls iscsi_target_login_drop()->iscsi_target_login_sess_out().
>   np->np_login_timer is canceled, so initiator A' hangs
>   there forever. Because A' is running in the login thread, no
>   other login requests can be serviced.

iqn1 and iqn2 are different targets, right? It’s not clear to me how initiator B failing negotiation cancels the timer for the portal under a different iqn/target.

Is iqn2-tpg1->np1 a different struct than iqn1-tpg1->np1? I mean, would iscsit_get_tpg_from_np return a different np struct for initiator B than for A?
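
To make the question concrete, here is a rough user-space sketch of the quoted steps. It is only a model, not the kernel code: the Portal and PortalGroup classes and the replay() helper are hypothetical names, and the shared_portal flag is exactly the assumption I am asking about (whether both targets on port 3260 end up with the same np, and therefore the same login timer).

```python
# User-space model only; class and field names are hypothetical, not kernel structs.

class Portal:
    """One listening endpoint (e.g. port 3260) with a single login timer slot."""

    def __init__(self):
        self.timer_armed = False

    def start_login_timer(self):
        self.timer_armed = True

    def stop_login_timer(self):
        self.timer_armed = False


class PortalGroup:
    """A target portal group: a login semaphore (modeled by its holder) and a portal."""

    def __init__(self, name, portal):
        self.name = name
        self.portal = portal
        self.login_sem_holder = None


def replay(shared_portal):
    """Replay quoted steps 1-4; return True if A' is left blocked with no timer armed."""
    np1 = Portal()
    # The open question: do both targets point at the same portal object?
    np2 = np1 if shared_portal else Portal()
    tpg_a = PortalGroup("iqn1-tpg1", np1)
    tpg_b = PortalGroup("iqn2-tpg1", np2)

    # Step 1: A takes the semaphore on iqn1-tpg1, then its link goes down.
    tpg_a.login_sem_holder = "A"
    # Step 2: B's login on iqn2-tpg1 continues outside the login thread
    # (nothing to model here until it fails in step 4).
    # Step 3: A' arms the timer on its portal and blocks on the semaphore A holds.
    tpg_a.portal.start_login_timer()
    a_prime_blocked = tpg_a.login_sem_holder is not None
    # Step 4: B fails negotiation; its cleanup stops the timer on B's portal.
    tpg_b.portal.stop_login_timer()

    return a_prime_blocked and not tpg_a.portal.timer_armed
```

In this model, replay(True) leaves A' blocked with its timer gone, which matches the hang in the description, while replay(False) leaves the timer armed so A' would still time out. So the reported hang seems to require that B's cleanup reaches the same timer that A' armed.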


