> On Sat, Jul 2, 2011 at 12:59 PM, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > On Sat, 2 Jul 2011, Andi Kleen wrote:
> >
> >> > The problem is that blk_peek_request() calls scsi_prep_fn(), which
> >> > does this:
> >> >
> >> >         struct scsi_device *sdev = q->queuedata;
> >> >         int ret = BLKPREP_KILL;
> >> >
> >> >         if (req->cmd_type == REQ_TYPE_BLOCK_PC)
> >> >                 ret = scsi_setup_blk_pc_cmnd(sdev, req);
> >> >         return scsi_prep_return(q, req, ret);
> >> >
> >> > It doesn't check to see if sdev is NULL, nor does
> >> > scsi_setup_blk_pc_cmnd().  That accounts for this error:
> >>
> >> I actually added a NULL check in scsi_setup_blk_pc_cmnd early on,
> >> but that just caused RCU CPU stalls afterwards and then eventually
> >> a hung system.
> >
> > The RCU problem is likely to be a separate issue.  It might even be a
> > result of the use-after-free problem with the elevator.
> >
> > At any rate, it's clear that the crash in the refcounting log you
> > posted occurred because scsi_setup_blk_pc_cmnd() called
> > scsi_prep_state_check(), which tried to dereference the NULL pointer.
> >
> > Would you like to try this patch to see if it fixes the problem?  As I
> > said before, I'm not certain it's the best thing to do, but it worked
> > on my system.
> >
> > Alan Stern
> >
> >
> >
> > Index: usb-3.0/drivers/scsi/scsi_lib.c
> > ===================================================================
> > --- usb-3.0.orig/drivers/scsi/scsi_lib.c
> > +++ usb-3.0/drivers/scsi/scsi_lib.c
> > @@ -1247,6 +1247,8 @@ int scsi_prep_fn(struct request_queue *q
> >         struct scsi_device *sdev = q->queuedata;
> >         int ret = BLKPREP_KILL;
> >
> > +       if (!sdev)
> > +               return ret;
> >         if (req->cmd_type == REQ_TYPE_BLOCK_PC)
> >                 ret = scsi_setup_blk_pc_cmnd(sdev, req);
> >         return scsi_prep_return(q, req, ret);
> > Index: usb-3.0/drivers/scsi/scsi_sysfs.c
> > ===================================================================
> > --- usb-3.0.orig/drivers/scsi/scsi_sysfs.c
> > +++ usb-3.0/drivers/scsi/scsi_sysfs.c
> > @@ -322,6 +322,8 @@ static void scsi_device_dev_release_user
> >                 kfree(evt);
> >         }
> >
> > +       /* Freeing the queue signals to block that we're done */
> > +       scsi_free_queue(sdev->request_queue);
> >         blk_put_queue(sdev->request_queue);
> >         /* NULL queue means the device can't be used */
> >         sdev->request_queue = NULL;
> > @@ -936,8 +938,6 @@ void __scsi_remove_device(struct scsi_de
> >         /* cause the request function to reject all I/O requests */
> >         sdev->request_queue->queuedata = NULL;
> >
> > -       /* Freeing the queue signals to block that we're done */
> > -       scsi_free_queue(sdev->request_queue);
> >         put_device(dev);
> >  }
>
> This patch seems to resolve the block/scsi null-ptr de-references in
> our libsas/isci environment, we have yet to try James' alternative
> [1].  Do we potentially need both?
>
> Commit 86cbfb56 moved scsi_free_queue to __scsi_remove_device() but it
> seems only the "sdev->request_queue->queuedata = NULL" needed to be
> moved?
>
> The conversation appeared to be awaiting test results...
>
> [1]: http://marc.info/?l=linux-scsi&m=131007155700831&w=2
>
> --
> Dan

[Jack Wang] This patch fixes the kernel panic I was seeing when hot-plugging a disk during I/O. I tested it using pm8001 on 3.0.0-rc6 with the above patch applied.
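
For anyone following the thread without a kernel tree at hand, the guard added in the scsi_lib.c hunk can be modelled in plain userspace C. This is only a sketch under stated assumptions: every model_* type, function and constant below is a hypothetical stand-in, not the kernel's definition, and the scsi_prep_return() step is left out entirely. It shows only the ordering that matters here: once removal has cleared queuedata, the prep function must return before anything dereferences sdev.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's types and constants. */
enum { MODEL_PREP_OK = 0, MODEL_PREP_KILL = 1 };
enum { MODEL_REQ_FS = 1, MODEL_REQ_BLOCK_PC = 2 };

struct model_device  { int online; };
struct model_queue   { struct model_device *queuedata; };
struct model_request { int cmd_type; };

/*
 * Stand-in for scsi_setup_blk_pc_cmnd(): like the function discussed
 * above, it dereferences sdev without checking it for NULL.
 */
static int model_setup_cmnd(struct model_device *sdev,
			    struct model_request *req)
{
	(void)req;
	return sdev->online ? MODEL_PREP_OK : MODEL_PREP_KILL;
}

/*
 * Same shape as the patched prep function: reject the request before
 * any dereference if removal has already cleared queuedata.
 */
static int model_prep_fn(struct model_queue *q, struct model_request *req)
{
	struct model_device *sdev = q->queuedata;
	int ret = MODEL_PREP_KILL;

	assert(req != NULL);
	if (!sdev)
		return ret;
	if (req->cmd_type == MODEL_REQ_BLOCK_PC)
		ret = model_setup_cmnd(sdev, req);
	return ret;
}

/* Models only the removal step that sets queuedata to NULL. */
static void model_remove_device(struct model_queue *q)
{
	q->queuedata = NULL;
}
```

Calling model_prep_fn() on a queue after model_remove_device() returns MODEL_PREP_KILL instead of faulting; without the `if (!sdev)` test the same call would dereference a NULL pointer inside model_setup_cmnd(). The model says nothing about the scsi_free_queue() ordering changed in the scsi_sysfs.c hunks.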
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html