On 06/06/12 14:12, Mike Christie wrote:
> On 06/06/2012 08:43 AM, Mike Christie wrote:
>> On 06/06/2012 07:25 AM, Bart Van Assche wrote:
>>> On 06/05/12 22:08, Mike Christie wrote:
>>>
>>>> On 06/05/2012 12:14 PM, Bart Van Assche wrote:
>>>>> Avoid that the code for requeueing SCSI requests triggers a
>>>>> crash by making sure that that code isn't scheduled anymore
>>>>> after a device has been removed.
>>>>>
>>>>> Also, source code inspection of __scsi_remove_device() revealed
>>>>> a race condition in this function: no new SCSI requests must be
>>>>> accepted for a SCSI device after device removal started.
>>>>>
>>>>> Signed-off-by: Bart Van Assche <bvanassche@xxxxxxx>
>>>>> Cc: Mike Christie <michaelc@xxxxxxxxxxx>
>>>>> Cc: James Bottomley <JBottomley@xxxxxxxxxxxxx>
>>>>> Cc: Jens Axboe <axboe@xxxxxxxxx>
>>>>> Cc: Joe Lawrence <jdl1291@xxxxxxxxx>
>>>>> Cc: Jun'ichi Nomura <j-nomura@xxxxxxxxxxxxx>
>>>>> Cc: <stable@xxxxxxxxxx>
>>>>> ---
>>>>>  drivers/scsi/scsi_lib.c   |  7 ++++---
>>>>>  drivers/scsi/scsi_sysfs.c | 11 +++++++++--
>>>>>  2 files changed, 13 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>>>> index 082c1e5..b722a8b 100644
>>>>> --- a/drivers/scsi/scsi_lib.c
>>>>> +++ b/drivers/scsi/scsi_lib.c
>>>>> @@ -158,10 +158,11 @@ static void __scsi_queue_insert(struct scsi_cmnd *cmd, int reason, int unbusy)
>>>>>       * that are already in the queue.
>>>>>       */
>>>>>      spin_lock_irqsave(q->queue_lock, flags);
>>>>> -    blk_requeue_request(q, cmd->request);
>>>>> +    if (!blk_queue_dead(q)) {
>>>>> +        blk_requeue_request(q, cmd->request);
>>>>> +        kblockd_schedule_work(q, &device->requeue_work);
>>>>> +    }
>>>>>      spin_unlock_irqrestore(q->queue_lock, flags);
>>>>> -
>>>>> -    kblockd_schedule_work(q, &device->requeue_work);
>>>>
>>>> If we do not have the part of the patch above, but have your other
>>>> patches and the code below, will we be ok?
>>>
>>>
>>> I'm not sure. Without the above part the request could get killed after
>>> the blk_requeue_request() call finished but before the requeue_work is
>>> scheduled, e.g. because the request timer fired or due to a
>>> blk_abort_queue() call.
>>>
>>
>> You are right.
>>
>> What if we moved the requeue work struct to the request queue, then have
>> blk_cleanup_queue or blk_drain_queue call cancel_work_sync before the
>> queue is freed. That way that code could make sure the queue and work is
>> flushed and drained, and it can make sure it is flushed and drained
>> before freeing the queue?
>
> Or, in scsi_requeue_run_queue could we just add a check for the
> scsi_device being in the SDEV_DEL state. That combined with your cancel
> call in __scsi_remove_device would prevent us from running a cleaned up
> queue, right?

I'm not sure. If a requeued request times out before
blk_cleanup_queue() is invoked then it's possible that the requeue_work
is started after the struct scsi_device has already been deleted.

Bart.
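
For readers following the thread, here is a rough user-space model of the ordering the quoted scsi_lib.c hunk is after: the "is the queue dead?" check, the requeue, and the scheduling of the requeue work all happen inside one critical section. This is only a sketch, not kernel code. struct model_queue and the model_*() helpers are hypothetical names made up for illustration; a pthread mutex stands in for q->queue_lock, and two counters stand in for blk_requeue_request() and kblockd_schedule_work().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue; not struct request_queue. */
struct model_queue {
    pthread_mutex_t lock;  /* plays the role of q->queue_lock */
    bool dead;             /* plays the role of blk_queue_dead(q) */
    int requeued;          /* counts modelled blk_requeue_request() calls */
    int work_scheduled;    /* counts modelled kblockd_schedule_work() calls */
};

static void model_queue_init(struct model_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->dead = false;
    q->requeued = 0;
    q->work_scheduled = 0;
}

/* Removal path: flag the queue as dead while holding the lock. */
static void model_queue_mark_dead(struct model_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->dead = true;
    pthread_mutex_unlock(&q->lock);
}

/*
 * Requeue path in the shape of the patched hunk: the dead check, the
 * requeue and the work scheduling are done under the same lock, so the
 * queue cannot be marked dead between the requeue and the scheduling.
 * Returns true if the request was requeued and the work was scheduled.
 */
static bool model_requeue(struct model_queue *q)
{
    bool done = false;

    pthread_mutex_lock(&q->lock);
    if (!q->dead) {
        q->requeued++;        /* models blk_requeue_request() */
        q->work_scheduled++;  /* models kblockd_schedule_work() */
        done = true;
    }
    pthread_mutex_unlock(&q->lock);
    return done;
}
```

In this model, once model_queue_mark_dead() has returned, model_requeue() neither requeues nor schedules anything. It says nothing about work that was already scheduled before the queue was marked dead, which is the case the reply above is worried about.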