On Sun, 2013-12-15 at 16:32 -0500, Alan Stern wrote: > On Sat, 14 Dec 2013, James Bottomley wrote: > > > > The refcount test and state change race with scsi_alloc_target(). > > > Maybe the race won't occur in practice, but to be safe you should hold > > > shost->host_lock throughout that time interval, as the original code > > > here does. > > > > You mean the fact that using a state model to indicate whether we should > > destroy a target without bothering to refcount isn't robust against two > > threads of execution running through a scan on the same target? > > I meant that the patch you posted suffers from a race when one thread > is adding a device to a target and another thread is removing an > existing device below that target at the same time. Suppose the > target's reap_ref count is initially equal to 1: > > Thread 0 Thread 1 > -------- -------- > In scsi_alloc_target(): In scsi_target_reap(): > lock host_lock scsi_target_reap_ref_put(): > find existing starget starget->reap_ref drops to 0 > incr starget->reap_ref In scsi_target_reap_ref_release(): > unlock host_lock device_del(&starget->dev); > starget->state == STARGET_DEL? > No => okay to use starget set starget->state = STARGET_DEL > > Result: We end up using starget _and_ removing it. The only way to > avoid this race would be to guarantee that we never add and remove > devices below the same target at the same time. In theory this is > feasible, but I don't know if you want to do it. > > This doesn't seem to be what you are talking about above. In any case, > it is a bug. No, I was thinking of the two thread scan bug (i.e. two scan threads) not one scan and one remove, which is a bug in the old code. This is a race between put and get when the kref is incremented from zero (an illegal operation which triggers a warn on). The way to mediate this is to check for the kref already being zero condition, like below. James --- diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 327c0e92..303d471 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -473,7 +473,13 @@ static struct scsi_target *scsi_alloc_target(struct device *parent, return starget; found: - kref_get(&found_target->reap_ref); + if (!kref_get_unless_zero(&found_target->reap_ref)) + /* + * release routine already fired. Target is dead, but + * STARGET_DEL may not yet be set (set in the release + * routine), so set here as well, just in case + */ + found_target->state = STARGET_DEL; spin_unlock_irqrestore(shost->host_lock, flags); if (found_target->state != STARGET_DEL) { put_device(dev); -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html