On Mon, 11 Jul 2011, Vivek Goyal wrote: > > > There's still the issue that Stefan Richter pointed out: The test for a > > > dead queue must be made _after_ acquiring the queue lock, not _before_. > > > > Yes, quite important. > > > > Jens, can you tweak the patch or should Roland send a v2? > > I do not think that we should do queue dead check after taking a spinlock. > The reason being that there are life time issues of two objects. > > - Validity of request queue pointer > - Validity of q->spin_lock pointer > > If the dm has taken the reference to the request queue in the beginning > then it can be sure request queue pointer is valid. But spin_lock might > be coming from driver and might be in one of driver allocated structures. > So it might happen that driver has called blk_cleanup_queue() and freed > up structures which contained the spin lock. Surely this is a bug in the design of the block layer? > So if queue is not dead, we know that q->spin_lock is valid. I think > only race present here is that whole operation is not atomic. First > we check for queue not dead flag and then go on to acquire request > queue lock. So this leaves a small window for race. I think I have > seen other code written in such manner (__generic_make_request()). So > it proably reasonably safe to do here too. "Probably reasonably safe" = "unsafe". The fact that it will usually work out okay means that when it does fail, it will be very difficult to track down. It needs to be fixed _now_, when people are aware of the issue. Not five years from now, when everybody has forgotten about it. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html