On 2020-04-17 05:51, yu kuai wrote:
> I recently got a KASAN warning like this in our 4.19 kernel:
>
>  ==================================================================
>  BUG: KASAN: slab-out-of-bounds in bt_for_each+0x1dc/0x2c0
>  Read of size 8 at addr ffff8000c0865000 by task sh/2023305
>
>  Call trace:
>  dump_backtrace+0x0/0x310
>  show_stack+0x28/0x38
>  dump_stack+0xd8/0x108
>  print_address_description+0x68/0x2d0
>  kasan_report+0x124/0x2e0
>  __asan_load8+0x88/0xb0
>  bt_for_each+0x1dc/0x2c0
>  blk_mq_queue_tag_busy_iter+0x1f0/0x3e8
>  blk_mq_in_flight+0xb4/0xe0
>  part_in_flight+0x124/0x178
>  part_round_stats+0x128/0x3b0
>  blk_account_io_start+0x2b4/0x3f0
>  blk_mq_bio_to_request+0x170/0x258
>  blk_mq_make_request+0x734/0xdd8
>  generic_make_request+0x388/0x740
>  submit_bio+0xd8/0x3d0
>  ext4_io_submit+0xb4/0xe0 [ext4]
>  ext4_writepages+0xb44/0x1c00 [ext4]
>  do_writepages+0xc8/0x1f8
>  __filemap_fdatawrite_range+0x200/0x2a0
>  filemap_flush+0x30/0x40
>  ext4_alloc_da_blocks+0x54/0x200 [ext4]
>  ext4_release_file+0xfc/0x150 [ext4]
>  __fput+0x15c/0x3a8
>  ____fput+0x24/0x30
>  task_work_run+0x1a4/0x208
>  do_notify_resume+0x1a8/0x1c0
>  work_pending+0x8/0x10
>
>  Allocated by task 3515778:
>  kasan_kmalloc+0xe0/0x190
>  kmem_cache_alloc_trace+0x18c/0x418
>  alloc_pipe_info+0x74/0x240
>  create_pipe_files+0x74/0x2f8
>  __do_pipe_flags+0x48/0x168
>  do_pipe2+0xa0/0x1d0
>  __arm64_sys_pipe2+0x3c/0x50
>  el0_svc_common+0xb4/0x1d8
>  el0_svc_handler+0x50/0xa8
>  el0_svc+0x8/0xc
>
>  Freed by task 3515778:
>  __kasan_slab_free+0x120/0x228
>  kasan_slab_free+0x10/0x18
>  kfree+0x88/0x3d8
>  free_pipe_info+0x150/0x178
>  put_pipe_info+0x138/0x1c0
>  pipe_release+0xe8/0x120
>  __fput+0x15c/0x3a8
>  ____fput+0x24/0x30
>  task_work_run+0x1a4/0x208
>  do_notify_resume+0x1a8/0x1c0
>  work_pending+0x8/0x10

The alloc/free info refers to a data structure owned by the pipe
implementation. The use-after-free report refers to a data structure
owned by the block layer. How can that report make sense?

> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 7ed16ed13976..48b74d0085c7 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -485,6 +485,7 @@ static void __blk_mq_free_request(struct request *rq)
>          struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, ctx->cpu);
>          const int sched_tag = rq->internal_tag;
>
> +        hctx->tags->rqs[rq->tag] = NULL;
>          if (rq->tag != -1)
>                  blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
>          if (sched_tag != -1)

Can the above change trigger the following assignment?
hctx->tags->rqs[-1] = NULL?

> @@ -1999,7 +2000,7 @@ struct blk_mq_tags *blk_mq_alloc_rq_map(struct blk_mq_tag_set *set,
>          if (!tags)
>                  return NULL;
>
> -        tags->rqs = kcalloc_node(nr_tags, sizeof(struct request *),
> +        tags->rqs = kzalloc_node(nr_tags, sizeof(struct request *),
>                                   GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY,
>                                   node);

From include/linux/slab.h:

static inline void *kcalloc_node(size_t n, size_t size, gfp_t flags,
                                 int node)
{
        return kmalloc_array_node(n, size, flags | __GFP_ZERO, node);
}

I think this means that kcalloc_node() already zeroes the allocated
memory and hence that changing kcalloc() into kzalloc() is not necessary.

>          if (!tags->rqs) {
> diff --git a/block/blk-mq.h b/block/blk-mq.h
> index a6094c27b827..2a55292d3d51 100644
> --- a/block/blk-mq.h
> +++ b/block/blk-mq.h
> @@ -196,6 +196,7 @@ static inline void blk_mq_put_driver_tag_hctx(struct blk_mq_hw_ctx *hctx,
>          if (rq->tag == -1 || rq->internal_tag == -1)
>                  return;
>
> +        hctx->tags->rqs[rq->tag] = NULL;
>          __blk_mq_put_driver_tag(hctx, rq);
>  }
>
> @@ -207,6 +208,7 @@ static inline void blk_mq_put_driver_tag(struct request *rq)
>                  return;
>
>          hctx = blk_mq_map_queue(rq->q, rq->mq_ctx->cpu);
> +        hctx->tags->rqs[rq->tag] = NULL;
>          __blk_mq_put_driver_tag(hctx, rq);
>  }

I don't think the above changes are sufficient to fix the
use-after-free. Has it been considered to free the memory that backs
tags->bitmap_tags only after an RCU grace period has expired? See also
blk_mq_free_tags().

Thanks,

Bart.
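
To make the rqs[-1] question about __blk_mq_free_request() concrete, here is a minimal userspace sketch, not kernel code. The names model_request, model_tags, MODEL_NR_TAGS and model_clear_rq_slot() are hypothetical and exist only for illustration. It shows only the ordering point: the slot is cleared after the tag has been checked, so a request without a driver tag (tag == -1) never indexes the array. It does not address the lifetime issue that the RCU suggestion is about.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_TAGS 4	/* hypothetical tag depth, illustration only */

/* Hypothetical stand-ins for struct request and struct blk_mq_tags. */
struct model_request {
	int tag;			/* -1 means no driver tag assigned */
};

struct model_tags {
	struct model_request *rqs[MODEL_NR_TAGS];
};

/*
 * Clear the rqs[] slot only for a valid tag. Returns 1 if a slot was
 * cleared and 0 if the request had no usable tag, in which case the
 * array is left untouched.
 */
static int model_clear_rq_slot(struct model_tags *tags,
			       const struct model_request *rq)
{
	assert(tags != NULL && rq != NULL);

	if (rq->tag < 0 || rq->tag >= MODEL_NR_TAGS)
		return 0;

	tags->rqs[rq->tag] = NULL;
	return 1;
}
```

In the quoted hunk, the equivalent change would be to move the new assignment under the existing "if (rq->tag != -1)" test rather than placing it before that test.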