On 3/16/23 5:39 AM, Dmitry Bogdanov wrote:
> On Wed, Mar 15, 2023 at 04:42:19PM -0500, Mike Christie wrote:
>> On 3/15/23 2:11 PM, Dmitry Bogdanov wrote:
>>> On Wed, Mar 15, 2023 at 11:44:48AM -0500, Mike Christie wrote:
>>>> On 3/15/23 11:13 AM, Dmitry Bogdanov wrote:
>>>>> On Thu, Mar 09, 2023 at 04:33:07PM -0600, Mike Christie wrote:
>>>>>> This fixes a bug where an initiator thinks a LUN_RESET has cleaned
>>>>>> up running commands when it hasn't. The bug was added in:
>>>>>>
>>>>>> commit 51ec502a3266 ("target: Delete tmr from list before processing")
>>>>>>
>>>>>> The problem occurs when:
>>>>>>
>>>>>> 1. We have N IO cmds running in the target layer spread over 2 sessions.
>>>>>> 2. The initiator sends a LUN_RESET for each session.
>>>>>> 3. session1's LUN_RESET loops over all the running commands from both
>>>>>> sessions and moves them to its local drain_task_list.
>>>>>> 4. session2's LUN_RESET does not see the LUN_RESET from session1 because
>>>>>> the commit above has it remove itself. session2 also does not see any
>>>>>> commands since the other reset moved them off the state lists.
>>>>>> 5. session2's LUN_RESET will then complete with a successful response.
>>>>>> 6. session2's initiator believes the running commands on its session are
>>>>>> now cleaned up due to the successful response and cleans up the running
>>>>>> commands from its side. It then restarts them.
>>>>>> 7. The commands do eventually complete on the backend and the target
>>>>>> starts to return aborted task statuses for them. The initiator will
>>>>>> either throw an invalid ITT error or might accidentally look up a new
>>>>>> task if the ITT has been reallocated already.
>>>>>>
>>>>>> This fixes the bug by reverting the patch, and also serializes the
>>>>>> execution of LUN_RESETs and Preempt and Aborts. The latter is necessary
>>>>>> because it turns out the commit accidentally fixed a bug where if there
>>>>>> are 2 LUN RESETs executing they can see each other on the dev_tmr_list,
>>>>>> put the other one on their local drain list, then end up waiting on each
>>>>>> other resulting in a deadlock.
>>>>> If LUN_RESET is not in the TMR list anymore there is no need to serialize
>>>>> core_tmr_drain_tmr_list.
>>>> Ah shoot, yeah, I miswrote that. I meant I needed the serialization for my
>>>> bug, not yours.
>>> I still don't get why you are wrapping core_tmr_drain_*_list in a mutex.
>>> general_tmr_list has only aborts now and they do not wait for other aborts.
>> Do you mean I don't need the mutex for the bug I originally hit that's described
>> at the beginning? If you're saying I don't need it for 2 resets running at the
>> same time, I agree. I thought I needed it if we have a RESET and a Preempt and
>> Abort:
>>
>> 1. You have 2 sessions. There are no TMRs initially.
>> 2. session1 gets a Preempt and Abort. It calls core_tmr_drain_state_list
>> and takes all the cmds from both sessions and puts them on its local
>> drain_task_list.
>> 3. session1 or 2 gets a LUN_RESET. It sees no cmds on the device's
>> state_lists, and returns success.
>> 4. The initiator thinks the commands were cleaned up by the LUN_RESET.
>>
>> - It could end up re-using the ITT while the original task being cleaned up is
>> still running. Then depending on which session got what and if TAS was set, if
>> the original command completes first then the initiator would think the second
>> command failed with SAM_STAT_TASK_ABORTED.
>>
>> - If there was no TAS, or the RESET and Preempt and Abort were on the same
>> session, then we could still hit a bug. We get the RESET response, the initiator
>> might retry the cmds or fail and the app might retry. The retry might go down a
>> completely different path on the target (like if hw queue1 was blocked and had
>> the original command, but this retry goes down hw queue2 due to being received
>> on a different CPU, so it completes right away). We do some new IO. Then hw
>> queue1 unblocks and overwrites the new IO.
>>
>> With the mutex, the LUN_RESET will wait for the Preempt and Abort
>> which is waiting on the running commands. I could have had Preempt
>> and Abort create a tmr, queue a work, and go through that path,
>> but I thought it looked uglier faking it.
> Thank you for the explanation. But I think you are not right here.
> Preempt And Abort is used to change the reservation holder and abort
> the preempted session's commands. A preempted session is not allowed to
> send any new messages; they will be failed anyway.

For the ITT bug, a preempted session can still send commands like INQUIRY,
TURS, RTPG, PR-in, etc. If those commands have the same ITT as the command
the Preempt and Abort is waiting on, we can hit the bug.

Also, in general for the ITT bug, even if the new cmd was going to be failed
due to a conflict, it's not right. Eventually the command the Preempt and
Abort is waiting on completes. The initiator is going to end up logging a
message the user almost never sees about getting a command response but no
running command, drop the connection, and bug people like us :)

For the second issue, if the LUN_RESET came after the Preempt and Abort on
the same session, the RESET doesn't clear the registrations and reservation.
So it's going to be sending IO down that specific path, so those commands
will be executing.

I agree with you for the case with no TAS and the RESET and Preempt and
Abort running on different sessions. I was thinking the path that got
preempted could later get registered and start sending IO, but I don't
think that makes sense.
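
If it helps to see the ordering I'm after, below is a rough userspace model
(plain pthreads; every name in it is made up, it is not the real se_device or
TMR code) of what the shared mutex buys us: the LUN_RESET's success response
cannot go out until the Preempt and Abort has finished waiting on the commands
it drained off the shared state list.

/*
 * Toy userspace model of the ordering, NOT kernel code. All names below are
 * made up. "Commands" are just a counter on a shared "state list". The
 * Preempt and Abort thread drains them to a private list and then waits for
 * them to finish; the LUN_RESET thread checks the state list and reports
 * success. With tmr_mutex taken around both drain paths, the reset's success
 * cannot be reported until the Preempt and Abort is done with the cmds it took.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NR_CMDS 4

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER; /* protects list */
static pthread_mutex_t tmr_mutex = PTHREAD_MUTEX_INITIALIZER;  /* serializes TMRs */
static int state_list_cnt = NR_CMDS;	/* cmds still on the device state list */

static void *preempt_and_abort(void *arg)
{
	int draining;

	(void)arg;
	pthread_mutex_lock(&tmr_mutex);

	/* Move every running cmd to a private drain list. */
	pthread_mutex_lock(&state_lock);
	draining = state_list_cnt;
	state_list_cnt = 0;
	pthread_mutex_unlock(&state_lock);

	printf("PREEMPT AND ABORT: waiting on %d drained cmds\n", draining);
	sleep(1);	/* stand-in for waiting on the backend to complete them */
	printf("PREEMPT AND ABORT: cmds completed\n");

	pthread_mutex_unlock(&tmr_mutex);
	return NULL;
}

static void *lun_reset(void *arg)
{
	int remaining;

	(void)arg;
	/*
	 * Without this lock the reset runs inside the drain window above and
	 * reports success while the drained cmds are still outstanding.
	 */
	pthread_mutex_lock(&tmr_mutex);

	pthread_mutex_lock(&state_lock);
	remaining = state_list_cnt;
	pthread_mutex_unlock(&state_lock);

	printf("LUN_RESET: %d cmds on the state list, returning success\n",
	       remaining);

	pthread_mutex_unlock(&tmr_mutex);
	return NULL;
}

int main(void)
{
	pthread_t pa, lr;

	pthread_create(&pa, NULL, preempt_and_abort, NULL);
	usleep(100 * 1000);	/* let the Preempt and Abort start draining first */
	pthread_create(&lr, NULL, lun_reset, NULL);

	pthread_join(pa, NULL);
	pthread_join(lr, NULL);
	return 0;
}

The ordering is the only point here; in the actual patch the locking wraps
the core_tmr_drain_*_list calls, as you noted above.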