Re: [PATCH 4/4] scsi: Stop accepting SCSI requests before removing a device

Mike Christie <michaelc@xxxxxxxxxxx> · Wed, 06 Jun 2012 08:43:07 -0500

On 06/06/2012 07:25 AM, Bart Van Assche wrote:
> On 06/05/12 22:08, Mike Christie wrote:
> 
>> On 06/05/2012 12:14 PM, Bart Van Assche wrote:
>>> Avoid that the code for requeueing SCSI requests triggers a
>>> crash by making sure that that code isn't scheduled anymore
>>> after a device has been removed.
>>>
>>> Also, source code inspection of __scsi_remove_device() revealed
>>> a race condition in this function: no new SCSI requests must be
>>> accepted for a SCSI device after device removal started.
>>>
>>> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
>>> Cc: Mike Christie <michaelc@xxxxxxxxxxx>
>>> Cc: James Bottomley <JBottomley@xxxxxxxxxxxxx>
>>> Cc: Jens Axboe <axboe@xxxxxxxxx>
>>> Cc: Joe Lawrence <jdl1291@xxxxxxxxx>
>>> Cc: Jun'ichi Nomura <j-nomura@xxxxxxxxxxxxx>
>>> Cc: <stable@xxxxxxxxxx>
>>> ---
>>>  drivers/scsi/scsi_lib.c   |    7 ++++---
>>>  drivers/scsi/scsi_sysfs.c |   11 +++++++++--
>>>  2 files changed, 13 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>> index 082c1e5..b722a8b 100644
>>> --- a/drivers/scsi/scsi_lib.c
>>> +++ b/drivers/scsi/scsi_lib.c
>>> @@ -158,10 +158,11 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, int unbusy)
>>>  	 * that are already in the queue.
>>>  	 */
>>>  	spin_lock_irqsave(q->queue_lock, flags);
>>> -	blk_requeue_request(q, cmd->request);
>>> +	if (!blk_queue_dead(q)) {
>>> +		blk_requeue_request(q, cmd->request);
>>> +		kblockd_schedule_work(q, &device->requeue_work);
>>> +	}
>>>  	spin_unlock_irqrestore(q->queue_lock, flags);
>>> -
>>> -	kblockd_schedule_work(q, &device->requeue_work);
>>
>> If we do not have the part of the patch above, but have your other
>> patches and the code below, will we be ok?
> 
> 
> I'm not sure. Without the above part the request could get killed after
> the blk_requeue_request() call finished but before the requeue_work is
> scheduled, e.g. because the request timer fired or due to a
> blk_abort_queue() call.
> 

You are right.

What if we moved the requeue work struct into the request queue and had
blk_cleanup_queue or blk_drain_queue call cancel_work_sync before the
queue is freed? That way the cleanup code could make sure that the queue
is drained and the requeue work is flushed before it frees the queue.
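
To make the ordering concrete, here is a rough user-space model of the
idea. It is only a sketch, not kernel code: every model_* name is
made up for illustration, the "work" is a pending flag rather than a
real struct work_struct, and model_cancel_work_sync() only imitates the
"nothing is pending once this returns" part of cancel_work_sync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a work item; not the kernel's work_struct. */
struct model_work {
	bool pending;
	void (*fn)(struct model_work *w);
};

/* Hypothetical stand-in for a request queue that owns its requeue work. */
struct model_queue {
	bool dead;			/* models blk_queue_dead(q) */
	int requeued;			/* requests put back on the queue */
	int runs;			/* times the requeue work actually ran */
	struct model_work requeue_work;	/* lives in the queue, not the device */
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Work function: only touches the queue that embeds the work item. */
void model_requeue_fn(struct model_work *w)
{
	struct model_queue *q =
		model_container_of(w, struct model_queue, requeue_work);

	q->runs++;
}

void model_queue_init(struct model_queue *q)
{
	q->dead = false;
	q->requeued = 0;
	q->runs = 0;
	q->requeue_work.pending = false;
	q->requeue_work.fn = model_requeue_fn;
}

/* Imitates cancel_work_sync(): true if pending work was cancelled. */
bool model_cancel_work_sync(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Stands in for the worker thread picking up pending work. */
void model_run_pending(struct model_work *w)
{
	if (w->pending) {
		w->pending = false;
		w->fn(w);
	}
}

/* Requeue path: refuse to requeue or schedule work on a dead queue. */
bool model_requeue(struct model_queue *q)
{
	if (q->dead)
		return false;
	q->requeued++;
	q->requeue_work.pending = true;
	return true;
}

/* Cleanup path: mark the queue dead, then cancel before anything is freed. */
bool model_cleanup_queue(struct model_queue *q)
{
	bool cancelled;

	q->dead = true;
	cancelled = model_cancel_work_sync(&q->requeue_work);
	assert(!q->requeue_work.pending);
	return cancelled;
}
```

In this model a requeue followed by model_cleanup_queue() leaves no
pending work behind, and any later model_requeue() call is refused, so
the work function can never run against a queue that is about to be
freed. Whether the real blk_cleanup_queue or blk_drain_queue is the
right place for the cancel is exactly the open question above.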