Re: [usb-storage] UAS hangs khubd on USB disconnect

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Fri, 13 Dec 2013 12:03:19 -0800

On Fri, 2013-12-13 at 11:18 -0800, James Bottomley wrote:
> On Fri, 2013-12-13 at 13:33 -0500, Tejun Heo wrote:
> > Hello, guys.
> > 
> > (cc'ing Greg)
> > 
> > On Fri, Dec 13, 2013 at 01:19:36PM -0500, Alan Stern wrote:
> > > On Fri, 13 Dec 2013, Sarah Sharp wrote:
> > > 
> > > > > Given the way things work now, I suspect these warnings are truly 
> > > > > harmless.  We could simply get rid of the WARN in sysfs_remove_group.
> > > > > 
> > > > > The alternative is to call device_del for SCSI targets earlier on, such 
> > > > > as when their hosts are unregistered.  I don't know how James would 
> > > > > feel about this approach.  It would be difficult because targets use 
> > > > > their own reference counts instead of relying on the usual device 
> > > > > refcounting mechanism.
> > > > 
> > > > Thanks for looking into this.  I think just getting rid of the WARN
> > > > would be sufficient.  Can you make a patch for that?
> > > 
> > > Easily.  The downside is that there would no longer be any warning 
> > > when someone tries to remove a wrong subdirectory by mistake.
> > > 
> > > > The patch still won't help with the UAS issues with
> > > > scsi_init_shared_tag_map though.
> > > 
> > > I wasn't clear on the reason for that problem.  Does it also arise from 
> > > late device_del for scsi_target?  I could try to change the way that 
> > > works, if anybody (Hans?) would like to test it.
> > 
> > While the recent sysfs changes made this issue more visible, Greg
> > wants to make sure that devices are removed from leaf up in all cases
> > and keep the warning to ensure that.  Would there be a way to fix SCSI
> > removal ordering?
> 
> Could someone analyse the actual problem?  We're quite careful even on
> host remove to iterate and remove all the devices, then targets, then
> host (and allied transport objects).  Which removal is inverted?

Actually, I think I have this figured out.  There's a thinko in one of
the scsi_target_reap() cases.  The original (and still existing) problem
with targets is that nothing explicitly creates or destroys them, so,
while we could rely on the refcounting of the device model to preserve
the actual target object, we had no idea when to remove it from
visibility.  Tracking visibility was the job of the reap reference.  It
looks like the reap on the last put of the device is occurring too
late.  I think we should reap immediately after doing the sdev
device_del.  Does this fix the warning?  (I'm not sure, because no-one
has actually posted a backtrace, but it sounds like this is the
problem.)
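
To make the ordering argument concrete, here is a toy userspace model of
the two schemes.  It is only a sketch: the toy_* names and
target_visible_after_remove() are made up for illustration and are not
kernel APIs, and the real code also deals with locking and target state
that is ignored here.  It assumes one device per target and one extra
holder of a device reference at removal time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a SCSI target; not the kernel's struct. */
struct toy_target {
	int reap_ref;	/* visibility count, one per attached device */
	bool visible;	/* true between "device_add" and "device_del" */
};

/* Hypothetical stand-in for a SCSI device. */
struct toy_sdev {
	struct toy_target *target;
	int refcount;	/* lifetime count of the device object */
	bool visible;
};

/* Drop one reap reference; the last one hides the target. */
static void toy_target_reap(struct toy_target *t)
{
	if (--t->reap_ref == 0)
		t->visible = false;	/* stands in for device_del(target) */
}

/* Hide the device; with reap_early, reap the target right away. */
static void toy_remove_device(struct toy_sdev *s, bool reap_early)
{
	s->visible = false;		/* stands in for device_del(sdev) */
	if (reap_early)
		toy_target_reap(s->target);
}

/* Drop a device reference; the last put runs the release step. */
static void toy_put_device(struct toy_sdev *s, bool reap_early)
{
	if (--s->refcount == 0 && !reap_early)
		toy_target_reap(s->target);	/* late reap, in release */
}

/*
 * Remove a device while someone else still holds a reference to it,
 * and report whether the target is still visible at that point.
 */
bool target_visible_after_remove(bool reap_early)
{
	struct toy_target t = { .reap_ref = 1, .visible = true };
	/* refcount 2: one for the core, one for a lingering holder */
	struct toy_sdev s = { .target = &t, .refcount = 2, .visible = true };
	bool seen;

	toy_remove_device(&s, reap_early);
	toy_put_device(&s, reap_early);	/* core drops its reference */
	seen = t.visible;		/* the window we care about */

	toy_put_device(&s, reap_early);	/* lingering holder lets go */
	assert(!t.visible);		/* both schemes hide it eventually */
	return seen;
}
```

In this model, target_visible_after_remove(false) returns true (the
target outlives its device in visibility for as long as anyone holds a
device reference) and target_visible_after_remove(true) returns false.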

James

---

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 8ff62c2..98d4eb3 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -399,8 +399,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	/* NULL queue means the device can't be used */
 	sdev->request_queue = NULL;
 
-	scsi_target_reap(scsi_target(sdev));
-
 	kfree(sdev->inquiry);
 	kfree(sdev);
 
@@ -1044,6 +1042,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	} else
 		put_device(&sdev->sdev_dev);
 
+	scsi_target_reap(scsi_target(sdev));
+
 	/*
 	 * Stop accepting new requests and wait until all queuecommand() and
 	 * scsi_run_queue() invocations have finished before tearing down the

