Hi James,
James Bottomley wrote:
On Tue, 2009-01-13 at 14:50 +0100, Hannes Reinecke wrote:
Hi all,
currently we have this:
struct scsi_device *scsi_device_lookup_by_target(struct scsi_target *starget,
uint lun)
{
struct scsi_device *sdev;
struct Scsi_Host *shost = dev_to_shost(starget->dev.parent);
unsigned long flags;
spin_lock_irqsave(shost->host_lock, flags);
sdev = __scsi_device_lookup_by_target(starget, lun);
if (sdev && scsi_device_get(sdev))
sdev = NULL;
spin_unlock_irqrestore(shost->host_lock, flags);
return sdev;
}
now consider an sdev list with two entries for LUN 0, the first
being in SDEV_DEL and the second being in SDEV_RUNNING.
scsi_device_lookup_by_target will always return NULL here, as
it will never be able to skip past the first (unuseable)
entry. So scsi_report_lun_scan will happily create duplicate
sdevs for LUN 0 and we'll be getting the infamous
sysfs: duplicate filename 'xxxx' can not be created
errors.
What we should be doing here is to restart
__scsi_device_lookup_by_target() if we find an
sdev but cannot use it (ie is in SDEV_DEL).
James, please apply.
Cheers,
Hannes
plain text document attachment (scsi-restart-lookup-by-target)
Restart scsi_device_lookup_by_target()
When scsi_device_lookup_by_target() will always return
the first sdev with a matching LUN, regardless of
the state. However, when this sdev is in SDEV_DEL
it's quite possible to have additional sdevs in the
list with the same LUN which are active.
So we should restart __scsi_device_lookup_by_target()
whenever we find an sdev with state SDEV_DEL.
Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
That's brilliant, thanks for finding this bug.
rather than go to the trouble of restarting the scan looking for
duplicates, since none of the users of __scsi_device_lookup_by_target
actually want invisible dying devices in SDEV_DEL, why not just make it
skip over them. If we ignore devices in SDEV_DEL, the target id of the
visible devices should all be unique, so alter the
if (sdev->lun == lun)
to
if (sdev->lun == lun && sdev->sdev_state != SDEV_DEL)
And everything should work.
I think this is the correct fix because scsi_device_get() is also
programmed to ignore devices in SDEV_DEL (for the same reason).
As already mentioned, I'm not sure if esp_scsi.c expects it to
be that way. If someone confirms that we won't break it with this
change, then of course it's the easier way.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@xxxxxxx +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html