On Thu, May 28 2015 at 10:54am -0400,
Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:

> On 05/28/15 16:07, Mike Snitzer wrote:
> >On Thu, May 28 2015 at 9:10P -0400,
> >Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> >
> >>On Thu, May 28 2015 at 4:19am -0400,
> >>Bart Van Assche <bart.vanassche@xxxxxxxxxxx> wrote:
> >>
> >>>On 05/28/15 00:37, Mike Snitzer wrote:
> >>>>FYI, I've staged a variant patch for 4.1 that is simpler; along with the
> >>>>various fixes I've picked up from Junichi and the leak fix I emailed
> >>>>earlier. They are now in linux-next and available in this 'dm-4.1'
> >>>>specific branch (based on 4.1-rc5):
> >>>>https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-4.1
> >>>>
> >>>>Please try and let me know if your test works.
> >>>
> >>>No data corruption was reported this time but a very large number of
> >>>memory leaks were reported by kmemleak. The initiator system ran out
> >>>of memory after some time due to these leaks. Here is an example of
> >>>a leak reported by kmemleak:
> >>>
> >>>unreferenced object 0xffff8800a39fc1a8 (size 96):
> >>>  comm "srp_daemon", pid 2116, jiffies 4294955508 (age 137.600s)
> >>>  hex dump (first 32 bytes):
> >>>    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>>    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
> >>>  backtrace:
> >>>    [<ffffffff81600029>] kmemleak_alloc+0x49/0xb0
> >>>    [<ffffffff81167d19>] kmem_cache_alloc_node+0xd9/0x190
> >>>    [<ffffffff81425400>] scsi_init_request+0x20/0x40
> >>>    [<ffffffff812cbb98>] blk_mq_init_rq_map+0x228/0x290
> >>>    [<ffffffff812cbcc6>] blk_mq_alloc_tag_set+0xc6/0x220
> >>>    [<ffffffff81427488>] scsi_mq_setup_tags+0xc8/0xd0
> >>>    [<ffffffff8141e34f>] scsi_add_host_with_dma+0x6f/0x300
> >>>    [<ffffffffa04c62bf>] srp_create_target+0x11cf/0x1600 [ib_srp]
> >>>    [<ffffffff813f9c93>] dev_attr_store+0x13/0x20
> >>>    [<ffffffff81200a33>] sysfs_kf_write+0x43/0x60
> >>>    [<ffffffff811fff8b>] kernfs_fop_write+0x13b/0x1a0
> >>>    [<ffffffff81183e53>] __vfs_write+0x23/0xe0
> >>>    [<ffffffff81184524>] vfs_write+0xa4/0x1b0
> >>>    [<ffffffff811852d4>] SyS_write+0x44/0xb0
> >>>    [<ffffffff81613cdb>] system_call_fastpath+0x16/0x73
> >>>    [<ffffffffffffffff>] 0xffffffffffffffff
> >>
> >>I suspect I'm missing some cleanup of the request I got from the
> >>underlying blk-mq device. I'll have a closer look.
> >
> >BTW, your test was with the dm-4.1 branch right?
> >
> >The above kmemleak trace clearly speaks to dm-mpath's ->clone_and_map_rq
> >having allocated the underlying scsi-mq request. So it'll later require
> >a call to dm-mpath's ->release_clone_rq to free the associated memory --
> >which happens in dm.c:free_rq_clone().
> >
> >But I'm not yet seeing where we'd be missing a required call to
> >free_rq_clone() in the DM core error paths. You can try this patch to
> >see if you hit the WARN_ON but I highly doubt you would.. similarly the
> >clone request shouldn't ever be allocated (nor tio->clone initialized)
> >in the REQUEUE case:
>
> Hello Mike,
>
> This occurred with the dm-4.1 branch merged with the for-4.2 IB
> branch. The leak was reported for regular I/O and before I started
> to trigger path failures. I had a look myself at how the
> sense_buffer pointer is manipulated by the scsi-mq code but could
> not find anything that is wrong. So the next thing I did was to repeat
> my test with kmemleak disabled. During this test the number of
> kmalloc-96 objects in /proc/slabinfo remained constant. So I
> probably have hit a bug in kmemleak. Maybe the code that clears and
> restores the sense buffer pointer in scsi_mq_prep_fn() is confusing
> kmemleak? Sorry for the noise.

Ah, no problem, very good news (albeit strange)!

So you can confirm that with dm-4.1 your test passes? If possible,
please try your test a few times. Also, if time permits, please vary
scsi-mq and dm-mq enable/disable (4 permutations) to make sure all
supported modes pass your SRP torture test.

I just have to review Junichi's patch from today to silence the WARN_ON
I added; once I work through that I'll likely send dm-4.1 to Linus.

Thanks for all your help testing.

Mike

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
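
For reference, the save/zero/restore pattern Bart is pointing at in
scsi_mq_prep_fn() boils down to something like the small userspace sketch
below. To be clear, this is not the real kernel code: "struct cmd" and
"prep_cmd()" are made-up stand-ins for scsi_cmnd and the scsi-mq prep path,
and the 96-byte malloc only mirrors the kmalloc-96 sense buffer seen in the
kmemleak report. The point is just that the sole stored reference to the
buffer briefly disappears from the structure while it is being zeroed, which
is the kind of transient state that could plausibly confuse a leak scanner
even though nothing actually leaks.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct cmd {
            char *sense_buffer;     /* stand-in for scsi_cmnd::sense_buffer */
            int flags;              /* stand-in for other per-command state */
    };

    /*
     * Mimics the clear-and-restore dance: the only reference to the
     * sense buffer is held in a local variable while the containing
     * structure is zeroed, then written back afterwards.
     */
    static void prep_cmd(struct cmd *c)
    {
            char *saved_sense = c->sense_buffer;

            memset(c, 0, sizeof(*c));
            c->sense_buffer = saved_sense;
    }

    int main(void)
    {
            struct cmd c = { .sense_buffer = malloc(96), .flags = 1 };

            if (!c.sense_buffer)
                    return 1;

            prep_cmd(&c);
            printf("sense buffer still reachable at %p\n",
                   (void *)c.sense_buffer);
            free(c.sense_buffer);
            return 0;
    }

Running it shows the buffer is still reachable after prep_cmd() returns; the
scanner would only be fooled if it sampled memory during the window where the
structure is zeroed and did not see the saved copy.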