Re: [PATCH 5/5] SCSI: fix target allocation outside of scan_mutex

Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> · Fri, 19 Feb 2010 15:21:08 -0500 (EST)

On Fri, 19 Feb 2010, James Bottomley wrote:

> On Fri, 2010-02-12 at 12:14 -0500, Alan Stern wrote:
> > This patch (as1337) fixes a bug in __scsi_add_device().  It calls
> > scsi_alloc_target() outside the protection of the host's scan_mutex,
> > meaning that it might find an incompletely-initialized target or it
> > might create a duplicate target.
> 
> I don't think this is correct.  scsi_alloc_target should be completely
> atomic on the host lock, if you look.  It allocates the proposed target,
> takes the lock and searches the host target list ... if it finds
> something it drops the lock, destroys the proposed target allocations
> and returns what it found.  If it finds nothing, it adds the proposed
> target to the list drops the lock and returns it.  There's some
> complexity around finding dying targets, but nothing that I think needs
> to be mediated by the scan mutex.

You're right that a duplicate target wouldn't get created.  Still,
there really is a bug.  Suppose two different threads call
scsi_alloc_target() for the same target ID at about the same time.  The
first thread won't find an existing target on the list, so it will
allocate and initialize a new target structure.  However, it releases
the host lock before calling the host template's target_alloc method.

Meanwhile, the second thread will find the new target structure on the 
list and will therefore feel free to start using it.  If the timing 
works out, it could start using the new target before the first thread 
calls target_alloc.

Perhaps you would say a better solution to this race is to move the
target_alloc call into the spinlocked region.  Fine.  But there would
still be a problem.  The whole purpose of the scan_mutex is to prevent
threads from adding new devices below a host that's in the process of
removal.  Since __scsi_add_device() doesn't use the scan_mutex, there's
nothing to prevent it from doing exactly that.

In fact, there's yet another race.  This is between scsi_alloc_target()
and scsi_target_reap().  Because scsi_alloc_target() doesn't check
whether reap_ref is 0 before incrementing it, we could have something
like this:

	Thread 0			Thread 1
	--------			--------
	call scsi_target_reap()
	lock host_lock
	--starget->reap_ref goes to 0
	unlock host_lock
					call scsi_target_alloc()
					lock host_lock
					find found_target = starget
					found_target->reap_ref++
					unlock host_lock
					found_target->state != STARGET_DEL,
						so begin using it
	starget->state = STARGET_DEL
	remove starget

To prevent this while relying only on the host lock,
scsi_alloc_target() should check the value of reap_ref before
incrementing it.  It also wouldn't hurt to make scsi_alloc_target()  
check the target state and scsi_target_reap() set the state all under 
the protection of the host lock.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html