Re: [PATCH 13/18] scsi: target: Fix multiple LUN_RESET handling

On 3/15/23 11:13 AM, Dmitry Bogdanov wrote:
> On Thu, Mar 09, 2023 at 04:33:07PM -0600, Mike Christie wrote:
>> "Attention! This email is from an external sender!"
>>
>> This fixes a bug where an initiator thinks a LUN_RESET has cleaned
>> up running commands when it hasn't. The bug was added in:
>>
>> commit 51ec502a3266 ("target: Delete tmr from list before processing")
>>
>> The problem occurs when:
>>
>> 1. We have N IO cmds running in the target layer spread over 2 sessions.
>> 2. The initiator sends a LUN_RESET for each session.
>> 3. session1's LUN_RESET loops over all the running commands from both
>> sessions and moves them to its local drain_task_list.
>> 4. session2's LUN_RESET does not see the LUN_RESET from session1 because
>> the commit above has it remove itself. session2 also does not see any
>> commands since the other reset moved them off the state lists.
>> 5. session2's LUN_RESET will then complete with a successful response.
>> 6. session2's initiator believes the running commands on its session are
>> now cleaned up due to the successful response and cleans up the running
>> commands from its side. It then restarts them.
>> 7. The commands do eventually complete on the backend and the target
>> starts to return aborted task statuses for them. The initiator will
>> either throw an invalid ITT error or might accidentally look up a new
>> task if the ITT has been reallocated already.
>>
>> This fixes the bug by reverting the patch, and it also serializes the
>> execution of LUN_RESETs and Preempt and Aborts. The latter is necessary
>> because it turns out the commit accidentally fixed a bug where, if there
>> are 2 LUN_RESETs executing, they can see each other on the dev_tmr_list,
>> put each other on their local drain lists, and then end up waiting on
>> each other, resulting in a deadlock.
> 
> If the LUN_RESET is not on the TMR list anymore, there is no need to
> serialize core_tmr_drain_tmr_list.

Ah shoot, yeah, I miswrote that. I meant I needed the serialization for my
bug, not yours.
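
To spell out the deadlock the serialization is meant to prevent, here is a
rough sketch of the drain-loop shape with the revert in place. This is
simplified and is not the actual core_tmr_drain_tmr_list; it only uses names
already mentioned in this thread:

/*
 * Rough sketch (not the actual core_tmr_drain_tmr_list) of the drain
 * loop shape with the revert in place.  Each LUN_RESET walks
 * dev_tmr_list, skips only itself, claims every other TMR onto a local
 * drain list, and then waits for them.
 */
static void drain_tmr_list_sketch(struct se_device *dev,
                                  struct se_tmr_req *self)
{
        LIST_HEAD(drain_tmr_list);
        struct se_tmr_req *tmr, *tmp;

        spin_lock_irq(&dev->se_tmr_lock);
        list_for_each_entry_safe(tmr, tmp, &dev->dev_tmr_list, tmr_list) {
                if (tmr == self)
                        continue;       /* skip ourselves */
                /* a second running LUN_RESET is not us, so it gets claimed */
                list_move_tail(&tmr->tmr_list, &drain_tmr_list);
        }
        spin_unlock_irq(&dev->se_tmr_lock);

        list_for_each_entry_safe(tmr, tmp, &drain_tmr_list, tmr_list) {
                list_del_init(&tmr->tmr_list);
                /*
                 * Wait here for the claimed TMR to finish.  If that TMR
                 * is the other LUN_RESET, it has claimed us and is
                 * waiting on us in the same way, so neither completes:
                 * the deadlock the commit accidentally fixed.
                 */
        }
}

With the DF_RESETTING_LUN gate, only one reset can be in that loop at a
time, so the mutual wait can't happen.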


>>
>>         if (cmd->transport_state & CMD_T_ABORTED)
>> @@ -3596,6 +3597,22 @@ static void target_tmr_work(struct work_struct *work)
>>                         target_dev_ua_allocate(dev, 0x29,
>>                                                ASCQ_29H_BUS_DEVICE_RESET_FUNCTION_OCCURRED);
>>                 }
>> +
>> +               /*
>> +                * If this is the last reset the device can be freed after we
>> +                * run transport_cmd_check_stop_to_fabric. Figure out if there
>> +                * are other resets that need to be scheduled while we know we
>> +                * have a refcount on the device.
>> +                */
>> +               spin_lock_irq(&dev->se_tmr_lock);
> 
> tmr->tmr_list is removed from the list at the very end of the se_cmd
> lifecycle, so any number of LUN_RESETs can be on lun_reset_tmr_list. And
> all of them can be finished but not yet removed from the list.

Don't we remove it from the list a little later in this function when
we call transport_lun_remove_cmd?
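
To spell out the ordering I have in mind, a simplified sketch using only the
calls already named in this thread (the real tail of the function may
differ):

        /* ...simplified tail of target_tmr_work()... */

        /*
         * Block added by this patch: while we still hold a reference on
         * the device, decide whether another queued LUN_RESET has to be
         * scheduled.
         */
        spin_lock_irq(&dev->se_tmr_lock);
        /* ... */
        spin_unlock_irq(&dev->se_tmr_lock);

        /* where I think the tmr_list entry actually gets dropped */
        transport_lun_remove_cmd(cmd);
        /* after this returns the device reference may be gone */
        transport_cmd_check_stop_to_fabric(cmd);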

>  
> You could delete the lun_reset here, also nulling tmr->tmr_dev:
> +			list_del_init(&cmd->se_tmr_req->tmr_list);
> +			cmd->se_tmr_req->tmr_dev = NULL;
> 
> Then the check below becomes just
> +			if (!list_empty(&dev->lun_reset_tmr_list))

I could go either way on this. Normally it's best to have just the one
place where we handle something like the deletion and clearing. If I'm
correct, then it's already done a little later in this function, so we
are ok.

On the other hand, yeah my test is kind of gross.
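
Just so we're looking at the same thing, here is roughly how I read your
suggestion inside target_tmr_work. This is only a sketch: the helper that
kicks the next reset is a made-up placeholder, and the list/flag names come
from this patch and thread:

        /*
         * Sketch of the suggested cleanup: drop the finished LUN_RESET
         * from the per-device list right here and NULL tmr_dev so later
         * teardown skips it; the "any more resets queued?" test is then
         * a plain list_empty().  queue_next_lun_reset() is a placeholder,
         * not a real function.
         */
        spin_lock_irq(&dev->se_tmr_lock);
        list_del_init(&cmd->se_tmr_req->tmr_list);
        cmd->se_tmr_req->tmr_dev = NULL;

        if (!list_empty(&dev->lun_reset_tmr_list))
                queue_next_lun_reset(dev);      /* placeholder */
        else
                dev->dev_flags &= ~DF_RESETTING_LUN;
        spin_unlock_irq(&dev->se_tmr_lock);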


>>
>> +       spin_lock_irqsave(&dev->se_tmr_lock, flags);
>> +       if (cmd->se_tmr_req->function == TMR_LUN_RESET) {
>> +               /*
>> +                * We only allow one reset to execute at a time to prevent
>> +                * one reset waiting on another, and to make sure one reset
>> +                * does not claim all the cmds causing the other reset to
>> +                * return early.
>> +                */
>> +               if (dev->dev_flags & DF_RESETTING_LUN) {
>> +                       spin_unlock_irqrestore(&dev->se_tmr_lock, flags);
>> +                       goto done;
>> +               }
>> +
>> +               dev->dev_flags |= DF_RESETTING_LUN;
> 
> Not a good choice of flag variable. It is used at configuration time and
> not under a lock. The configfs file dev/alias can be changed at any time
> and could race with a LUN_RESET.

I didn't see any places where one path can overwrite the other's flags. Are
you just saying that in general it could happen? If so, would you also not
want dev->transport_flags to be used then?
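
Just to make sure I follow the concern: it would be the classic lost update
on a shared flags word, along these lines? Illustration only, and I'm
assuming the DF_USING_ALIAS bit is what the dev/alias store touches:

        /* TMR path, read-modify-write of dev_flags under se_tmr_lock */
        spin_lock_irq(&dev->se_tmr_lock);
        dev->dev_flags |= DF_RESETTING_LUN;     /* load, OR, store */
        spin_unlock_irq(&dev->se_tmr_lock);

        /* configfs dev/alias store, read-modify-write with no se_tmr_lock */
        dev->dev_flags |= DF_USING_ALIAS;       /* load, OR, store */

        /*
         * If both loads happen before either store, one of the two bits
         * is silently lost, e.g. DF_RESETTING_LUN disappears while a
         * reset is still running.
         */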


