On Thu, May 07 2015 at 6:19am -0400,
Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:

> On 05/06/15 20:29, Mike Snitzer wrote:
> >On Wed, May 06 2015 at 3:45am -0400,
> >Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:
> >
> >>On 05/06/15 04:23, Mike Snitzer wrote:
> >>>On Tue, May 05 2015 at 10:04am -0400,
> >>>Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:
> >>>>While retesting my SRP initiator patches on top of kernel v4.1-rc2
> >>>>with DM_MQ_DEFAULT=y I ran into the kernel warning below. Does this
> >>>>mean that I'm missing any device mapper related patches ? This
> >>>>warning was reported shortly after scsi_remove_host() had been
> >>>>invoked.
> >>>
> >>>I put the warning in place because, to me, if it triggers it speaks to
> >>>unsafe teardown occuring (request is still completing but the queue it
> >>>was issued from no longer exists).
> >>>
> >>>Like I said before I'm open to removing the WARN_ON_ONCE() if this
> >>>scenario is perfectly valid. But I just haven't had time to revisit
> >>>what appears to be a potentially serious problem with the underlying
> >>>paths' teardown vs upper level mpath IO.
> >>>
> >>>I'll try to revisit this week. But I welcome input from others too.
> >>>
> >>>(Just thinking about it further now, it could be that the way the clone
> >>>request is allocated in the case of blk-mq DM is as part of the original
> >>>request's pdu... meaning there isn't a proper get_request() call against
> >>>the underlying queue.. so the expected refcounting likely isn't
> >>>happening. And given the request won't be free'd from that underlying
> >>>request_queue there really isn't a need to artificially link these
> >>>cloned requests with the underlying request_queue... so I'm now leaning
> >>>toward just removing the WARN_ON_ONCE.. but I'll look closer tomorrow)
> >>
> >>Hello Mike,
> >>
> >>With CONFIG_SCSI_MQ_DEFAULT=y and CONFIG_DM_MQ_DEFAULT=n I just ran into
> >>the bug report below. I will continue my v4.1-rc2 tests with SCSI_MQ=n.
> >
> >What were you doing when this happened? Quite a strange place to get a
> >NULL pointer (it should be noted that for 4.2 hch's patch does away with
> >cloning the request's bios). Is there an easy reproducer (unlikely
> >considering I've tested CONFIG_SCSI_MQ_DEFAULT=y and
> >CONFIG_DM_MQ_DEFAULT=n a fair amount).
> >
> >BTW, my "Just thinking about it further now" above was relative to
> >CONFIG_DM_MQ_DEFAULT=y and CONFIG_SCSI_MQ_DEFAULT=n.
>
> Hello Mike,
>
> With kernel v4.1-rc2, with CONFIG_SCSI_MQ_DEFAULT=y and
> CONFIG_DM_MQ_DEFAULT=n if I run "for p in
> /sys/class/srp_remote_ports/*; do echo 1 > $p/delete; done" if no
> I/O is running that command works fine. That command triggers a call
> of scsi_remove_host(). But if I run the same command while I/O is
> running the message "BUG: unable to handle kernel NULL pointer
> dereference at 0000000000000068 / IP: blk_rq_prep_clone+0x87/0x160"
> appears. I just reproduced this after having rebuilt the kernel
> after a "make clean".

Hey Bart,

Looks like Junichi likely fixed this issue you reported, please try this
patch: https://patchwork.kernel.org/patch/6487321/

Thanks,
Mike

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel