On Wed, Jul 06, 2011 at 01:06:44AM -0700, Roland Dreier wrote:
> > > I booted with slub_debug=FZUP and the 6b pattern in RAX pretty much
> > > does prove that this is a use-after-free issue. Any thoughts about
> > > how to pin this down before I muddle on in my lowbrow way?
> >
> > Alan Stern came up with a patch that could fix this:
> >
> > http://marc.info/?l=linux-kernel&m=130963676907731&w=2
>
> Thanks! It seems my crash is actually a different problem (or perhaps
> the same problem in different code), since I'm not going through SCSI
> directly but rather a dm-multipath device on top of SCSI disks. But
> it does seem that it is another case of the block queue elevator
> getting freed while requests can still be submitted.
>
> In my case, to reproduce this I have to hold the multipath device file
> open with something like "cat > /dev/dm-X" and then kill the
> underlying drive. What I think is happening (although I haven't
> traced all the layers to be sure) is that then the multipath daemon
> notices that all the paths to the disk are lost and tries to kill the
> multipath device, which ends up in dm.c:dm_destroy().
>
> This ends up in blk_cleanup_queue() which frees q->elevator, which of
> course leads to the crash.

So it's the identical crash that I reported also (well, at least one of
the crashes I reported).

> What's not clear to me is how things are supposed to work. It seems
> that the dm stuff at least is missing a lot of required reference
> counting, to make sure that some structure sticks around to reject IOs
> to the device after it is destroyed but while it is still open. But I
> don't understand why people don't hit this more since it is completely
> reproducible for me with a fairly normal setup (hot-remove a multipath
> device that some process has open).

cc'ing Alasdair as well, maybe he knows...
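
To make the "structure sticks around to reject IOs" idea concrete, here is a
minimal userspace sketch of that kind of reference counting. It is only an
illustration under stated assumptions: struct toy_dev and every toy_*
function are hypothetical names invented for this example, not dm or block
layer code, and it is single-threaded (real kernel code would need atomic
counting, for instance via the kref helpers, plus locking around the dead
flag).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a mapped device; not the real dm structure. */
struct toy_dev {
	int refs;	/* one for the owner plus one per open handle */
	bool dead;	/* set on destroy; I/O is rejected from then on */
	int *elevator;	/* stand-in for q->elevator, freed on destroy */
};

static int toy_freed_count;	/* counts final frees, for demonstration only */

static struct toy_dev *toy_create(void)
{
	struct toy_dev *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	d->elevator = calloc(1, sizeof(*d->elevator));
	if (!d->elevator) {
		free(d);
		return NULL;
	}
	d->refs = 1;	/* the owner's reference */
	return d;
}

/* Taken by each opener so the shell outlives a concurrent destroy. */
static void toy_get(struct toy_dev *d)
{
	assert(d->refs > 0);
	d->refs++;
}

/* Dropping the last reference frees the shell itself. */
static void toy_put(struct toy_dev *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0) {
		free(d->elevator);	/* already NULL if destroy ran */
		free(d);
		toy_freed_count++;
	}
}

/* Tear down the elevator but keep the shell while handles are open. */
static void toy_destroy(struct toy_dev *d)
{
	d->dead = true;
	free(d->elevator);
	d->elevator = NULL;
	toy_put(d);	/* drop the owner's reference */
}

/* Submission path: check the dead flag before touching the elevator. */
static int toy_submit(struct toy_dev *d)
{
	if (d->dead)
		return -ENODEV;
	(*d->elevator)++;
	return 0;
}
```

With this arrangement, an opener calls toy_get(), a hot-remove calls
toy_destroy() while the handle is still open, later toy_submit() calls fail
with -ENODEV instead of dereferencing freed memory, and the shell is only
freed by the final toy_put() at close time.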

> Alan Stern's patch looks a bit fishy -- the scsi_free_queue() is moved
> earlier than the
>
>	/* cause the request function to reject all I/O requests */
>	sdev->request_queue->queuedata = NULL;
>
> which seems to leave a small window where the use-after-free can
> happen, and it's not clear to me why the scsi_free_queue() has to move
> at all.
>
> Thanks,
>  - R.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html