On Mon, 2013-06-24 at 18:16 +0200, Bart Van Assche wrote: > On 06/24/13 17:38, James Bottomley wrote: > > I really don't like this because it's shuffling potentially fragile > > lifetime rules since you now have to have the sdev deleted from the > > starved list before final put. That becomes an unstated assumption > > within the code. > > > > The theory is that the starved list processing may be racing with a > > scsi_remove_device, so when we unlock the host lock, the device (and the > > queue) may be destroyed. OK, so I agree with this, remote a possibility > > though it may be. The easy way of fixing it without making assumptions > > is this, isn't it? All it requires is that the queue be destroyed after > > the starved list entry is deleted in the sdev release code. > > > > James > > > > --- > > > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c > > index 86d5220..f294cc6 100644 > > --- a/drivers/scsi/scsi_lib.c > > +++ b/drivers/scsi/scsi_lib.c > > @@ -434,6 +434,7 @@ static void scsi_run_queue(struct request_queue *q) > > list_splice_init(&shost->starved_list, &starved_list); > > > > while (!list_empty(&starved_list)) { > > + struct request_queue *slq; > > /* > > * As long as shost is accepting commands and we have > > * starved queues, call blk_run_queue. scsi_request_fn > > @@ -456,10 +457,21 @@ static void scsi_run_queue(struct request_queue *q) > > continue; > > } > > > > + /* > > + * once we drop the host lock, a racing scsi_remove_device may > > + * remove the sdev from the starved list and destroy it and > > + * the queue. Mitigate by taking a reference to the queue and > > + * never touching the sdev again after we drop the host lock. > > + */ > > + slq = sdev->request_queue; > > + if (!blk_get_queue(slq)) > > + continue; > > + > > spin_unlock(shost->host_lock); > > - spin_lock(sdev->request_queue->queue_lock); > > - __blk_run_queue(sdev->request_queue); > > - spin_unlock(sdev->request_queue->queue_lock); > > + > > + blk_run_queue(slq); > > + blk_put_queue(slq); > > + > > spin_lock(shost->host_lock); > > } > > /* put any unprocessed entries back */ > > Since the above patch invokes blk_put_queue() with interrupts disabled > it may cause blk_release_queue() to be invoked with interrupts disabled. > Sorry but I'm not sure whether that will work fine. I think it is, but make the unlock/lock _irqrestore and _irqsave and the concern goes away. James -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html