On Mon, Jun 6, 2016 at 11:12 AM, Catalin Marinas <catalin.marinas@xxxxxxx> wrote: > > On Mon, Jun 06, 2016 at 04:13:34PM +0200, Christoph Hellwig wrote: > > I've got a few reports of this over the weekend, but it still doesn't > > make much sense to me. > > > > Could it be that kmemleak can't deal with the bio_batch logic? I've > > tried to look at the various bio and biovec number entries in > > /proc/slabinfo, and while they keep changing a bit during the > > system runtime there doesn't seem to be a persistent increase > > even after lots of mkfs calls. > > I think the reported leaks settle after about 10-20min (2-3 kmemleak > periodic scans), so checking /proc/slabinfo may not be sufficient if > the leak is not growing. The leaks also do not seem to disappear, > otherwise kmemleak would no longer report them (e.g. after kfree, even > if they had been previously reported). > > What kmemleak reports is objects for which it cannot find a pointer (to > anywhere inside that object; e.g. list_heads are handled). False > positives are indeed present sometimes but for cases where pointers are > stored in non-tracked objects like alloc_pages(). > > It seems that this leak reports always come in pairs. The first one: > > unreferenced object 0xffff880262cbe900 (size 256): > comm "NetworkManager", pid 516, jiffies 4294895670 (age 2479.340s) > hex dump (first 32 bytes): > 00 00 00 00 00 00 00 00 c0 f3 ab 8a 00 88 ff ff ................ > 02 20 00 20 00 00 00 00 11 00 00 00 00 00 00 00 . . ............ > backtrace: > [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250 > [<ffffffff81175b42>] mempool_alloc+0x72/0x190 > [<ffffffff812dc9f6>] bio_alloc_bioset+0xb6/0x240 > [<ffffffff812eca7f>] next_bio+0x1f/0x50 > [<ffffffff812ecf1a>] blkdev_issue_zeroout+0xea/0x1d0 > [<ffffffffc025b610>] ext4_issue_zeroout+0x40/0x50 [ext4] > [<ffffffffc028c58d>] ext4_ext_map_blocks+0x144d/0x1bb0 [ext4] > [<ffffffff811839f4>] release_pages+0x254/0x310 > [<ffffffff8118437a>] __pagevec_release+0x2a/0x40 > [<ffffffffc025b297>] mpage_prepare_extent_to_map+0x227/0x2c0 [ext4] > [<ffffffffc025b793>] ext4_map_blocks+0x173/0x5d0 [ext4] > [<ffffffffc025f1d0>] ext4_writepages+0x700/0xd40 [ext4] > [<ffffffff8121312e>] legitimize_mnt+0xe/0x50 > [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250 > [<ffffffff81173fd5>] __filemap_fdatawrite_range+0xc5/0x100 > [<ffffffff81174113>] filemap_write_and_wait_range+0x33/0x70 > > is the first mempool_alloc() in bio_alloc_bioset() for struct bio and > front_pad. > > The second report: > > unreferenced object 0xffff880036488600 (size 256): > comm "NetworkManager", pid 516, jiffies 4294895670 (age 2479.348s) > hex dump (first 32 bytes): > 80 39 08 00 00 ea ff ff 00 10 00 00 00 00 00 00 .9.............. > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250 > [<ffffffff812dc8b7>] bvec_alloc+0x57/0xe0 > [<ffffffff812dcaaf>] bio_alloc_bioset+0x16f/0x240 > [<ffffffff812eca7f>] next_bio+0x1f/0x50 > [<ffffffff812ecf1a>] blkdev_issue_zeroout+0xea/0x1d0 > [<ffffffffc025b610>] ext4_issue_zeroout+0x40/0x50 [ext4] > [<ffffffffc028c58d>] ext4_ext_map_blocks+0x144d/0x1bb0 [ext4] > [<ffffffff811839f4>] release_pages+0x254/0x310 > [<ffffffff8118437a>] __pagevec_release+0x2a/0x40 > [<ffffffffc025b297>] mpage_prepare_extent_to_map+0x227/0x2c0 [ext4] > [<ffffffffc025b793>] ext4_map_blocks+0x173/0x5d0 [ext4] > [<ffffffffc025f1d0>] ext4_writepages+0x700/0xd40 [ext4] > [<ffffffff8121312e>] legitimize_mnt+0xe/0x50 > [<ffffffff811d244e>] kmem_cache_alloc+0xfe/0x250 > [<ffffffff81173fd5>] __filemap_fdatawrite_range+0xc5/0x100 > [<ffffffff81174113>] filemap_write_and_wait_range+0x33/0x70 > > is for the struct bio_vec allocation in bvec_alloc() (the one going via > kmem_cache_alloc). > > IIUC, the bio object above allocated via next_bio() -> > bio_alloc_bioset() is returned to __blkdev_issue_zeroout() which > eventually submits them either directly for the last one or via > next_bio(). > > Regarding bio chaining, I can't figure out what the first bio allocated > in __blkdev_issue_zeroout() is chained to since bio == NULL initially. > Subsequent next_bio() allocations are linked to the previous ones via > bio_chain() but somehow kmemleak loses track of the first one, hence the > subsequent bios are reported as leaks. That's unless the chaining should > be the other way around: > > diff --git a/block/blk-lib.c b/block/blk-lib.c > index 23d7f301a196..3bf78b7b74cc 100644 > --- a/block/blk-lib.c > +++ b/block/blk-lib.c > @@ -15,7 +15,7 @@ static struct bio *next_bio(struct bio *bio, int rw, unsigned int nr_pages, > struct bio *new = bio_alloc(gfp, nr_pages); > > if (bio) { > - bio_chain(bio, new); > + bio_chain(new, bio); > submit_bio(rw, bio); > } > > > Also confusing is that chaining is done via bio->bi_private, however > this is overridden in other places like submit_bio_wait(). > > However, since I don't fully understand this code, this chaining may not > even be essential to struct bio freeing (and I'm investigating the wrong > path). > > -- > Catalin > -- > To unsubscribe from this list: send the line "unsubscribe linux-block" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=CwIBAg&c=IGDlg0lD0b-nebmJJ0Kp8A&r=Wg5NqlNlVTT7Ugl8V50qIHLe856QW0qfG3WVYGOrWzA&m=uHYaMtk0CBsO1MWg4alX8kHC6bvXsmWCR9wL8nrX28E&s=4EHwF1z6acQYr9R052hraaTE1KUf6IiXcEmd-CSLV48&e= I'm pretty sure it is missing a bio_put() after submit_bio_wait(). Please excuse the hack-y patch but I think you need to do something like this ... (Note tabs eaten by gmail). diff --git a/block/blk-lib.c b/block/blk-lib.c index 23d7f30..9e29dc3 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -113,6 +113,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector, ret = submit_bio_wait(type, bio); if (ret == -EOPNOTSUPP) ret = 0; + bio_put(bio); } blk_finish_plug(&plug); @@ -165,8 +166,10 @@ int blkdev_issue_write_same(struct block_device *bdev, sector_t sector, } } - if (bio) + if (bio) { ret = submit_bio_wait(REQ_WRITE | REQ_WRITE_SAME, bio); + bio_put(bio); + } return ret != -EOPNOTSUPP ? ret : 0; } EXPORT_SYMBOL(blkdev_issue_write_same); @@ -206,8 +209,11 @@ static int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, } } - if (bio) - return submit_bio_wait(WRITE, bio); + if (bio) { + ret = submit_bio_wait(WRITE, bio); + bio_put(bio); + return ret; + } return 0; } -- Shaun Tancheff -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html