On Wed, 2012-04-11 at 12:10 -0400, Martin K. Petersen wrote:
> >>>>> "Mike" == Mike Christie <michaelc@xxxxxxxxxxx> writes:
>
> >>>> I have observed crashes at the same point while testing device
> >>>> removal with the ib_srp driver. As far as I can see that code was
> >>>> added through commit 18a4d0a22ed6c54b67af7718c305cd010f09ddf8
> >>>> (February 9, 2012). The approach of that patch looks questionable
> >>>> to me: what guarantees that the struct scsi_driver will be
> >>>> available at the time the SCSI error handler needs it ?
>
> Sorry about that!
>
>
> Mike> That is wrong. I guess REQ_DISCARD and REQ_FLUSH will, so I guess
> Mike> we just have to check for a NULL sdrv above.
>
> How about we do this?
>
>
> SCSI: Fix error handling when no ULD is attached
>
> Commit 18a4d0a2 introduced a bug in which we would attempt to
> dereference the scsi driver even when the device had no ULD attached.
>
> Ensure that a driver is registered and make the driver accessor function
> more resilient to errors during device discovery.
>
> Reported-by: Elric Fu <elricfu1@xxxxxxxxx>
> Reported-by: Bart Van Assche <bvanassche@xxxxxxx>
> Signed-off-by: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
>
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 2cfcbff..386f0c5 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -835,7 +835,7 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
>
>          scsi_eh_restore_cmnd(scmd, &ses);
>
> -        if (sdrv->eh_action)
> +        if (sdrv && sdrv->eh_action)
>                  rtn = sdrv->eh_action(scmd, cmnd, cmnd_size, rtn);
>
>          return rtn;
> diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
> index 377df4a..1e11985 100644
> --- a/include/scsi/scsi_cmnd.h
> +++ b/include/scsi/scsi_cmnd.h
> @@ -134,6 +134,9 @@ struct scsi_cmnd {
>
>  static inline struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd)
>  {
> +        if (!cmd->request->rq_disk)
> +                return NULL;
> +
>          return *(struct scsi_driver **)cmd->request->rq_disk->private_data;
>  }

I'm not entirely convinced by this. If medium access timeout processing
is so important, shouldn't it be done all the time rather than only when
the ULD is bound? In which case it should be part of the error handler
core?

James
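
For readers following along outside a kernel tree, the guard pattern in the quoted patch can be sketched in plain userspace C. This is only an illustration under stated assumptions: the structures are stripped-down stand-ins for the kernel ones, eh_action takes fewer arguments than the real callback, and mock_uld, mock_eh_action and finish_eh_cmnd are hypothetical names that do not exist in the kernel.

```c
#include <stddef.h>

struct scsi_cmnd;

/* Simplified stand-in: the real eh_action takes more arguments. */
struct scsi_driver {
    int (*eh_action)(struct scsi_cmnd *scmd, int rtn);
};

/* Hypothetical ULD private structure. As in the kernel's ULDs, the
 * driver pointer is assumed to be the first member. */
struct mock_uld {
    struct scsi_driver *driver;
};

struct gendisk {
    void *private_data;      /* points at a ULD structure when bound */
};

struct request {
    struct gendisk *rq_disk; /* NULL when no ULD is attached */
};

struct scsi_cmnd {
    struct request *request;
};

/* Accessor with the NULL check added by the patch. */
struct scsi_driver *scsi_cmd_to_driver(struct scsi_cmnd *cmd)
{
    if (!cmd->request->rq_disk)
        return NULL;

    return *(struct scsi_driver **)cmd->request->rq_disk->private_data;
}

/* Hypothetical callback used only for illustration: bumps the result. */
int mock_eh_action(struct scsi_cmnd *scmd, int rtn)
{
    (void)scmd;
    return rtn + 1;
}

/* Hypothetical helper mirroring the tail of scsi_send_eh_cmnd():
 * the callback runs only if a driver is bound and provides one. */
int finish_eh_cmnd(struct scsi_cmnd *scmd, int rtn)
{
    struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);

    if (sdrv && sdrv->eh_action)
        rtn = sdrv->eh_action(scmd, rtn);

    return rtn;
}
```

With a request whose rq_disk is NULL, finish_eh_cmnd() returns its input unchanged instead of dereferencing a missing driver. That skipped callback is the behaviour James questions above: the ULD-specific handling simply does not run for unbound devices.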