Re: [PATCH v3] target: core: remove from tmr_list at lun unlink

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/20/21 11:40 AM, Dmitriy Bogdanov wrote:
> Hi Mike,
> 
>>> @@ -234,6 +225,7 @@ static void core_tmr_drain_tmr_list(
>>>  		}
>>>  
>>>  		list_move_tail(&tmr_p->tmr_list, &drain_tmr_list);
>>> +		tmr_p->tmr_dev = NULL;
>>
>> Is this patch now adding a way to hit:
>>
>> if (!tmr->tmr_dev)
>> 	WARN_ON_ONCE(transport_lookup_tmr_lun(tmr->task_cmd) < 0);                      
>>
>> in core_tmr_abort_task?
>>
>> You have the abort and lun reset works running on different CPUs.
>> The lun reset hits the above code first and clears tmr_dev.
>> The abort then hits the tmr->tmr_dev check and tries to do
>> transport_lookup_tmr_lun.
>>
>> For the case where the lun is not removed, it looks like
>> transport_lookup_tmr_lun will add the tmr to the dev_tmr_list
>> but it would also be on the drain_tmr_list above so we would
>> hit list corruption.
> 
> Yes, there is a such race. I think,  I can solve it by changing the order of
> draining the tmr_list and state_list at LUN Reset to make the raced lines 
> be under the same lock.
> 
> Especially SAM-5 describes(but does not require) aborting commands
> before tmfs:
> | When responding to a logical unit reset condition, the logical unit shall:
> |	a) abort all commands as described in 5.6;
> |	b) abort all copy operations (see SPC-4);
> |	c) terminate all task management functions;
> 
> 
>> For the case where the lun is getting removed, percpu_ref_tryget_live
>> would fail in transport_lookup_tmr_lun and we hit the WARN_ON_ONCE.
>> I think though with your patch, we would be ok and don't want
>> the WARN_ON_ONCE, right? The lun reset would just wait for the
>> abort. When it completes the abort and reset complete as expected.
> 
> I don’t understand the meaning of that transport_lookup_tmr_lun there.
> Every TMF Abort has already executed transport_lookup_tmr_lun at the very
> beginning of its handling. 

Yeah, I think it's not needed. It came in with:

commit 2c9fa49e100f962af988f1c0529231bf14905cda
Author: Bart Van Assche <bvanassche@xxxxxxx>
Date:   Tue Nov 27 15:52:03 2018 -0800

    scsi: target/core: Make ABORT and LUN RESET handling synchronous
gree. It looks like it was added in:

and in that patch I can't see it ever happening. We have 2 ways to submit
an abort tmr:

1. target_submit_tmr - Calls transport_lookup_tmr_lun then
transport_generic_handle_tmr.

2. iscsit_handle_task_mgt_cmd - Will call transport_lookup_tmr_lun
for every tmr except the iscsi specific TASK REASSIGN. TASK REASSING
is not passed to transport_generic_handle_tmr.

I don't see any places where tmr_dev is NULL after transport_lookup_tmr_lun
has set it and added it to the list.

So I think you can just kill it.



[Index of Archives]     [Linux SCSI]     [Kernel Newbies]     [Linux SCSI Target Infrastructure]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Device Mapper]

  Powered by Linux