On Thu, 2011-08-11 at 10:59 -0400, Alan Stern wrote:
> On Thu, 11 Aug 2011, James Bottomley wrote:
>
> > > If the reason you moved scsi_free_queue into scsi_remove_device
> > > is marking the queue dead, how about the following patch?
> > > Do you think it's acceptable?
> >
> > Well, it's just hiding the problem.  The essential problem is that only
> > block has the correctly refcounted knowledge to know the last release of
> > the queue reference.  Until that time, the holder of the reference can
> > use the queue regardless of whether blk_cleanup_queue() has been called.
> > This is the race you complain about since use of the queue involves the
> > lock which should be guarded by QUEUE_DEAD checks.
> >
> > This is essentially unfixable with function calls.  The only way to fix
> > it is to have a callback model for freeing the external lock.
>
> Assuming the queue is associated with a device, the queue could take a
> reference to the device, dropping that reference when the queue is
> freed.  Then the lock could safely be freed at the same time as the
> device.

If that assumption is correct, there's no point refcounting the queue at
all because its use is entirely subordinated to the lifecycle of the
associated device.  Plus all the wittering about my previous patch is
pointless, because blk_cleanup_queue() has to do the final put of the
queue in the lock free path (otherwise the assumption is violated).

However, much as I'd like to accept this rosy view, the original oops
that started all of this in 2.6.38 was that someone caught something
holding a reference to a SCSI queue after the device release function
had been called.

James
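
To illustrate the callback model for freeing the external lock that is
discussed in the thread, here is a minimal user-space sketch in C.  It is
not kernel code and not a proposed patch: struct queue, struct ext_lock,
queue_get(), queue_put(), queue_cleanup() and driver_release_lock() are
all hypothetical names invented for the example, and the refcount is a
plain int because the sketch is single-threaded.  What it tries to show
is that cleanup only marks the queue dead and drops one reference, while
the lock is released from a callback run by whoever does the last put.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */

struct ext_lock {
    int id;                       /* stands in for a driver-owned lock */
};

struct queue {
    int refcount;                 /* plain int: the sketch is single-threaded */
    bool dead;                    /* set by cleanup, like a QUEUE_DEAD flag */
    struct ext_lock *lock;        /* owned by the driver, not by the queue */
    void (*release_lock)(struct ext_lock *lock);
};

static int locks_released;        /* counts release-callback invocations */

static void driver_release_lock(struct ext_lock *lock)
{
    free(lock);
    locks_released++;
}

static struct queue *queue_alloc(void (*release_lock)(struct ext_lock *))
{
    struct queue *q = calloc(1, sizeof(*q));
    struct ext_lock *lock = calloc(1, sizeof(*lock));

    if (!q || !lock) {
        free(q);
        free(lock);
        return NULL;
    }
    q->refcount = 1;              /* the creator's reference */
    q->lock = lock;
    q->release_lock = release_lock;
    return q;
}

static void queue_get(struct queue *q)
{
    q->refcount++;
}

/* Returns true if this was the last reference and the queue is gone. */
static bool queue_put(struct queue *q)
{
    assert(q->refcount > 0);
    if (--q->refcount > 0)
        return false;
    /* Only here is it known that nobody can touch the lock any more. */
    q->release_lock(q->lock);
    free(q);
    return true;
}

/* Marks the queue dead and drops the creator's reference, nothing more. */
static bool queue_cleanup(struct queue *q)
{
    q->dead = true;
    return queue_put(q);
}

/* Walks through the race from the thread: a holder outlives cleanup. */
static int demo_late_holder(void)
{
    struct queue *q = queue_alloc(driver_release_lock);
    int before;

    if (!q)
        return -1;
    locks_released = 0;
    queue_get(q);                 /* a second holder takes a reference */
    queue_cleanup(q);             /* owner tears down; the holder still has q */
    before = locks_released;      /* lock must still be alive at this point */
    queue_put(q);                 /* last put runs the callback */
    return before * 10 + locks_released;
}
```

In this sketch demo_late_holder() returns 1: no lock has been released
after cleanup (the tens digit is 0), and exactly one is released by the
final put.  If cleanup freed the lock directly instead, the late holder
would be left with a dangling lock pointer, which is the shape of the
race described above.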