Re: [PATCH 4/4] scsi: Stop accepting SCSI requests before removing a device

Mike Christie <michaelc@xxxxxxxxxxx> · Wed, 06 Jun 2012 09:12:26 -0500



On 06/06/2012 08:43 AM, Mike Christie wrote:
> On 06/06/2012 07:25 AM, Bart Van Assche wrote:
>> On 06/05/12 22:08, Mike Christie wrote:
>>
>>> On 06/05/2012 12:14 PM, Bart Van Assche wrote:
>>>> Avoid that the code for requeueing SCSI requests triggers a
>>>> crash by making sure that that code isn't scheduled anymore
>>>> after a device has been removed.
>>>>
>>>> Also, source code inspection of __scsi_remove_device() revealed
>>>> a race condition in this function: no new SCSI requests must be
>>>> accepted for a SCSI device after device removal started.
>>>>
>>>> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
>>>> Cc: Mike Christie <michaelc@xxxxxxxxxxx>
>>>> Cc: James Bottomley <JBottomley@xxxxxxxxxxxxx>
>>>> Cc: Jens Axboe <axboe@xxxxxxxxx>
>>>> Cc: Joe Lawrence <jdl1291@xxxxxxxxx>
>>>> Cc: Jun'ichi Nomura <j-nomura@xxxxxxxxxxxxx>
>>>> Cc: <stable@xxxxxxxxxx>
>>>> ---
>>>>  drivers/scsi/scsi_lib.c   |    7 ++++---
>>>>  drivers/scsi/scsi_sysfs.c |   11 +++++++++--
>>>>  2 files changed, 13 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>>> index 082c1e5..b722a8b 100644
>>>> --- a/drivers/scsi/scsi_lib.c
>>>> +++ b/drivers/scsi/scsi_lib.c
>>>> @@ -158,10 +158,11 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, int unbusy)
>>>>  	 * that are already in the queue.
>>>>  	 */
>>>>  	spin_lock_irqsave(q->queue_lock, flags);
>>>> -	blk_requeue_request(q, cmd->request);
>>>> +	if (!blk_queue_dead(q)) {
>>>> +		blk_requeue_request(q, cmd->request);
>>>> +		kblockd_schedule_work(q, &device->requeue_work);
>>>> +	}
>>>>  	spin_unlock_irqrestore(q->queue_lock, flags);
>>>> -
>>>> -	kblockd_schedule_work(q, &device->requeue_work);
>>>
>>> If we do not have the part of the patch above, but have your other
>>> patches and the code below, will we be ok?
>>
>>
>> I'm not sure. Without the above part the request could get killed after
>> the blk_requeue_request() call finished but before the requeue_work is
>> scheduled, e.g. because the request timer fired or due to a
>> blk_abort_queue() call.
>>
> 
> You are right.
> 
> What if we moved the requeue work struct to the request queue and had
> blk_cleanup_queue or blk_drain_queue call cancel_work_sync before the
> queue is freed? That way that code could make sure the work is flushed
> and the queue is drained before the queue is freed.

Or, in scsi_requeue_run_queue, could we just add a check for the
scsi_device being in the SDEV_DEL state? That, combined with your cancel
call in __scsi_remove_device, would prevent us from running a cleaned-up
queue, right?
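
A minimal userspace sketch of the SDEV_DEL check suggested above, not the
actual kernel code. The struct definitions are cut-down stand-ins for the
kernel types, the state enum lists only three states, and scsi_run_queue()
is a stub that merely counts how often the queue is run. It is
single-threaded, so it shows neither the locking nor the cancel call in
__scsi_remove_device that the check would be combined with.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
enum scsi_device_state { SDEV_RUNNING, SDEV_CANCEL, SDEV_DEL };

struct work_struct { int unused; };

struct request_queue { int runs; /* how many times the queue was run */ };

struct scsi_device {
	enum scsi_device_state sdev_state;
	struct request_queue *request_queue;
	struct work_struct requeue_work;
};

/* Same idea as the kernel macro: recover the enclosing structure. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stub for scsi_run_queue(): only records that the queue was touched. */
static void scsi_run_queue(struct request_queue *q)
{
	q->runs++;
}

/* Work handler with the proposed check added. */
static void scsi_requeue_run_queue(struct work_struct *work)
{
	struct scsi_device *sdev =
		container_of(work, struct scsi_device, requeue_work);

	/* Device is gone: its queue may already be cleaned up, so bail out. */
	if (sdev->sdev_state == SDEV_DEL)
		return;

	scsi_run_queue(sdev->request_queue);
}
```

With a device in SDEV_RUNNING the handler runs the queue; once the state
is SDEV_DEL it returns without touching the queue pointer at all.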