On Sun, 2008-08-17 at 22:47 -0500, Mike Christie wrote:
> James Bottomley wrote:
> > On Sun, 2008-08-17 at 15:18 -0500, michaelc@xxxxxxxxxxx wrote:
> >> From: Mike Christie <michaelc@xxxxxxxxxxx>
> >>
> >> The patch and description is from Aaro Koskinen. He sent us the
> >> patch against our fedora kernel, but he is short on time and did not have
> >> time to send it upstream, so I am sending it for him so it does not sit in
> >> just our trees.
> >>
> >> This patch applies to scsi-fc-fixes.
> >
> > One of the things that's missing from this is really the problem it's
> > trying to solve ... the below is just an analysis of potential bugs in
> > the sym2 code.
> >
>
> Sorry. I meant to add a link to the mail with the bug report. Here is my
> initial post:
> http://marc.info/?l=linux-scsi&m=120898142212407&w=2
>
> Basically users are trying to do a hot unplug and hotplug add of a disk.
> They will do:
>
> 1. echo 1 > /sys/block/sdb/device/delete
> (or do it from proc)
> 2. Remove the disk physically.
> 3. Insert new disk.
> 4. Rescan from sysfs
> (or from proc).
> 5. For the rescan we can either get:
>
> A. inquiry from scsi_scan.c will timeout and the driver's bus reset
> function will do a BUS RESET (bdr failed and so we got to the bus reset
> handler). This will succeed, and when the inquiry is resent it will
> succeed and the rescan will find the device and everything is fine.
>
> B. inquiry is failed with 0x100ff. We see this error message from
> scsi_scan.c:
>
> scsi_scan_host_selected: <1:0:0:0>
> scsi scan: INQUIRY to host 1 channel 0 id 0 lun 0
> scsi scan: 1st INQUIRY failed with code 0x100ff
>
>
> I will let Aaro handle the other questions because I know nothing about SPI.

Actually, the report says everything goes fine if they notify the mid
layer through sysfs before doing the removal. The problem is on
unnotified hot swap with a rescan afterwards.

The problem in the second case is the parameter mismatch.
The HBA still assumes the previous transfer parameters while the drive
assumes async. Tacking the parameters on to the messages before inquiry
and request sense is the standards-suggested way out of this, so that's
what the driver should be doing.

However, unnotified hot swap is also the completely incorrect way of
doing this. You could end up with the kernel connecting a device or
filesystem wrongly. Notify, remove, add and rescan is the correct way
of handling this.

James
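
For reference, the notify, remove, add and rescan sequence can be sketched
as two small shell helpers. This is only an illustration, not something from
the bug report: the function names and the SYSFS_ROOT override are
hypothetical, and sdb / host1 are assumed to be the disk and the HBA in
question. The "- - -" wildcard written to the host's scan attribute asks for
a scan of all channels, ids and luns on that host.

```shell
#!/bin/sh
# Sketch of a notified hot swap. Names here are illustrative only.
# SYSFS_ROOT is a hypothetical override so the steps can be dry-run
# against a scratch directory; on a real system it stays /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Notify: tell the mid layer the device is going away
# before the disk is physically pulled.
notify_remove() {
    disk="$1"    # block device name, e.g. sdb (assumed)
    echo 1 > "$SYSFS_ROOT/block/$disk/device/delete"
}

# Rescan: after the new disk is inserted, ask the host to scan
# all channels, ids and luns ("- - -" is the wildcard form).
rescan_host() {
    host="$1"    # SCSI host name, e.g. host1 (assumed)
    echo "- - -" > "$SYSFS_ROOT/class/scsi_host/$host/scan"
}
```

Usage would be: run "notify_remove sdb", swap the disk by hand, then run
"rescan_host host1". Both writes need root on a real system.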