Re: [PATCH 0/4] iscsi target: Fix oops during relogin

Mike Christie <mchristi@xxxxxxxxxx> · Fri, 31 Mar 2017 11:54:10 -0500

On 03/31/2017 01:00 AM, Nicholas A. Bellinger wrote:
> Hey Mike,
> 
> On Tue, 2017-03-21 at 09:14 -0400, Mike Christie wrote:
>> On 02/26/2017 10:03 PM, Mike Christie wrote:
>>> The following patches fix a oops that occurs when the initiator
>>> is trying to relogin to a iscsi target. The problem occurs, when
>>> the initiator has sent a command that is stuck running on some
>>> backend, and the initiator has sent TMFs and eventually escalated
>>> to session level recovery.
>>>
>>> During the relogin operation, the target will wait for the stuck
>>> command to complete, and the initiator may time out the relogin
>>> request and drop that tcp connection and retry. The target will
>>> then free the iscsi connection structs from under the np_thread
>>> and we will crash.
>>>
>>> Patches were made over the target-pending for-next branch.
>>>
>>
>> Hey Nick,
>>
>> Were these patches ok?
>>
>> I can rebuild them against your current tree, but before I do that, I
>> was thinking there might be a cleaner alternative you know about.
>>
>> I think my patches are a little ugly. The behavior for the modified
>> functions is now more difficult to follow because you can sleep, not
>> sleep and now interruptible sleep, and you can end up retrying them and
>> going down different branches on the retry.
>>
>> I think the alternative is some sort ref counting based teardown in that
>> login error path.
> 
> I haven't forgot about these, and apologies on the extended delay for a
> proper follow-up.

He He, I was a bad initiator maintainer in the end so I can't complain :)

> 
> What I'm confused by is the particular scenario described in the patch.
> That is, it's the same scenario DATERA Q/A and automation routinely
> tests on v4.1.y and v3.14.y with a few thousand active volumes.  So far
> we've not triggered a reproduction like the one described above.
> 
> Namely, where a backend driver takes an extended amount of time to
> complete an outstanding se_cmd, resulting in ABORT_TASK and LUN_RESET,
> followed by a session reinstatement that occurs while se_cmd is still
> outstanding to backend driver code.
> 
> If a session reinstatement fails due to it's login attempt taking longer
> than TA_LOGIN_TIMEOUT=15 seconds since the se_cmd in question still
> didn't complete, iscsi_handle_login_thread_timeout() fires and sends
> SIGINT to iscsi_np->np_thread.
> 
> If iscsi_check_for_session_reinstatement() is already blocked on
> iscsi_stop_session() -> wait_for_completion(), it will wait indefinitely
> until the se_cmd in question is completed back to target-core before
> allowing login to make forward progress, or fail due to the login
> timeout.

This is where we hit the problem.

At this time while in the wait, the initiator gives up (normally hit a
iscsi login timeout on the initiator side) on the login attempt and just
drops the tcp/ip connection. On the target side we detect this and
iscsi_target_sk_state_change runs and iscsi_target_do_cleanup which
frees the iscsi_login related resources.

When the command eventually completes, we wake from the
wait_for_completion and try to access the freed iscsi_login struct.

The problem is that iscsi_check_for_session_reinstatement ->
iscsi_target_check_for_existing_instances will return 0 after the
command has completed so the login thread does not know that login has
failed due to the tcp/ip connection getting dropped and the iscsi_login
struct has been freed. It will then try to access the freed iscsi_login
struct and proceed with the login process.

> 
> If iscsi_check_for_session_reinstatement hasn't been reached yet or
> hasn't blocked on wait_for_completion(), the SIGINT should fail the
> connection the next time it attempts to do socket I/O.
> 
> From what I can gather from the original problem statement, you are
> hitting something different than these two cases, right..?
> 
> So I'd really like to reproduce what you've seen to trigger the
> scenario, and jump into kgdb and see what's going on.  Would you mind
> giving me more details wrt you've been have to reproduce this, and even
> better, some debug code to reproduce at will..?
> 

I will send a patch for scsi_debug that can simulate the problem.

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html