Re: An oops will occur while SCSI core is being used in 3.4-rc1

"Rustad, Mark D" <mark.d.rustad@xxxxxxxxx> · Fri, 13 Apr 2012 00:30:28 +0000

On Apr 10, 2012, at 1:16 AM, Bart Van Assche wrote:

> On 04/10/12 01:22, Elric Fu wrote:
> 
>> After debugging the code, I found the issue happened while the driver ran to
>> line 782 in scsi_send_eh_cmnd().
>> 
>> 778 static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
>> 779                              int cmnd_size, int timeout, unsigned sense_bytes)
>> 780 {
>> 781         struct scsi_device *sdev = scmd->device;
>> 782         struct scsi_driver *sdrv = scsi_cmd_to_driver(scmd);
>> 783         struct Scsi_Host *shost = sdev->host;
>> 784         DECLARE_COMPLETION_ONSTACK(done);
>> 785         unsigned long timeleft;
>> 786         struct scsi_eh_save ses;
>> 787         int rtn;
>> 
>> I know the code was submitted by you. I am not familiar with the SCSI core.
>> It seems the conversion from SCSI command to SCSI driver encounters a
>> NULL pointer. Any idea?
> 
> I have observed crashes at the same point while testing device removal
> with the ib_srp driver. As far as I can see that code was added through
> commit 18a4d0a22ed6c54b67af7718c305cd010f09ddf8 (February 9, 2012). The
> approach of that patch looks questionable to me: what guarantees that
> the struct scsi_driver will be available at the time the SCSI error
> handler needs it? At least the sd driver explicitly resets that pointer
> in its scsi_disk_release() function.

I am looking into a similar crash with FCoE, though in my case it is the private_data field that is NULL instead of rq_disk. The backtraces are very much like what has been reported here. I will try adding some NULL checks similar to what has been proposed on the list, but until I know more than I do now, I won't let myself believe that NULL checks are the real fix for this issue.
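
For illustration only, here is a minimal userspace sketch of the kind of NULL check being discussed. It assumes the lookup walks request -> rq_disk -> private_data, as the two NULL fields mentioned above suggest. The structures are simplified stand-ins rather than the kernel definitions, and cmd_to_driver_checked() is a hypothetical name, not an existing kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption: only the
 * fields on the pointer chain are modelled here). */
struct scsi_driver { const char *name; };
struct scsi_disk   { struct scsi_driver *driver; }; /* driver pointer is the first member */
struct gendisk     { void *private_data; };         /* points at a struct scsi_disk */
struct request     { struct gendisk *rq_disk; };
struct scsi_cmnd   { struct request *request; };

/* Hypothetical checked lookup: returns NULL instead of dereferencing a
 * NULL link anywhere along the chain. */
struct scsi_driver *cmd_to_driver_checked(const struct scsi_cmnd *cmd)
{
	if (!cmd || !cmd->request)
		return NULL;
	if (!cmd->request->rq_disk)                 /* the rq_disk case */
		return NULL;
	if (!cmd->request->rq_disk->private_data)   /* the private_data case */
		return NULL;
	/* private_data points at an object whose first member is the
	 * driver pointer, so read that pointer through a cast. */
	return *(struct scsi_driver **)cmd->request->rq_disk->private_data;
}
```

A caller in the error handler would then have to treat a NULL return as "no driver available" and skip any driver-specific callback. A check like this only avoids the dereference; it does not explain why the pointer is NULL at that point, nor does it close any race with device removal.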

-- 
Mark Rustad, LAN Access Division, Intel Corporation
