Re: [RFC 2/2] dual scan thread bug fix

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2014-01-03 at 10:58 -0500, Alan Stern wrote:
> On Thu, 2 Jan 2014, James Bottomley wrote:
> 
> > In the highly unusual case where two threads are running concurrently through
> > the scanning code scanning the same target, we run into the situation where
> > one may allocate the target while the other is still using it.  In this case,
> > because the reap checks for STARGET_CREATED and kills the target without
> > reference counting, the second thread will do the wrong thing on reap.
> > 
> > Fix this by reference counting even creates and doing the STARGET_CREATED
> > check in the final put.
> 
> I'm still concerned about one thing.  The previous patch does this in
> scsi_alloc_target():
> 
> >   found:
> > -	found_target->reap_ref++;
> > +	if (!kref_get_unless_zero(&found_target->reap_ref))
> > +		/*
> > +		 * release routine already fired.  Target is dead, but
> > +		 * STARGET_DEL may not yet be set (set in the release
> > +		 * routine), so set here as well, just in case
> > +		 */
> > +		found_target->state = STARGET_DEL;
> >  	spin_unlock_irqrestore(shost->host_lock, flags);
> 
> As a result, the two comments in this patch aren't right:
> 
> > @@ -384,9 +385,15 @@ static void scsi_target_reap_ref_release(struct kref *kref)
> >  	struct scsi_target *starget
> >  		= container_of(kref, struct scsi_target, reap_ref);
> >  
> > -	transport_remove_device(&starget->dev);
> > -	device_del(&starget->dev);
> > -	starget->state = STARGET_DEL;
> > +	/*
> > +	 * if we get here and the target is still in the CREATED state that
> > +	 * means it was allocated but never made visible (because a scan
> > +	 * turned up no LUNs), so don't call device_del() on it.
> > +	 */
> > +	if (starget->state == STARGET_RUNNING) {
> > +		transport_remove_device(&starget->dev);
> > +		device_del(&starget->dev);
> > +	}
> 
> Here the state could already be STARGET_DEL, even though the target is
> still visible.

Well, I agree with the theory.  In practise, there are only a few
machine instructions between the kref going to zero and us reaching that
point, because kref_release will jump into this routine next, so the
condition would be very hard to see.  However, I suppose it's easy to
close by checking for != STARGET_CREATED and there's no reason not to do
that, so I'll change it.

> Also, it's a little odd that the comment talks about CREATED but the 
> code really checks for RUNNING.  They should be consistent.

!= STARGET_CREATED should solve this.

> > @@ -506,11 +513,13 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
> >   */
> >  void scsi_target_reap(struct scsi_target *starget)
> >  {
> > +	/*
> > +	 * serious problem if this triggers: STARGET_DEL is only set in the
> > +	 * kref release routine, so we're doing another final put on an
> > +	 * already released kref
> > +	 */
> >  	BUG_ON(starget->state == STARGET_DEL);
> 
> Here the code is okay but the comment is wrong: STARGET_DEL is set in 
> _two_ places (but neither of them runs until reap_ref has reached 0).
> 
> Would it be better in scsi_alloc_target() to behave as though the state 
> were STARGET_DEL without actually setting it?

Yes, I'll update the comment to it only goes to DEL after the kref goes
to zero.

How does the attached diff look?

James

---

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 82cf902..2f7de33 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -390,7 +390,7 @@ static void scsi_target_reap_ref_release(struct kref *kref)
 	 * means it was allocated but never made visible (because a scan
 	 * turned up no LUNs), so don't call device_del() on it.
 	 */
-	if (starget->state == STARGET_RUNNING) {
+	if (starget->state != STARGET_CREATED) {
 		transport_remove_device(&starget->dev);
 		device_del(&starget->dev);
 	}
@@ -514,9 +514,9 @@ static struct scsi_target *scsi_alloc_target(struct device *parent,
 void scsi_target_reap(struct scsi_target *starget)
 {
 	/*
-	 * serious problem if this triggers: STARGET_DEL is only set in the
-	 * kref release routine, so we're doing another final put on an
-	 * already released kref
+	 * serious problem if this triggers: STARGET_DEL is only set in the if
+	 * the reap_ref drops to zero, so we're trying to do another final put
+	 * on an already released kref
 	 */
 	BUG_ON(starget->state == STARGET_DEL);
 	scsi_target_reap_ref_put(starget);




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]
  Powered by Linux