On 12/19/18 8:24 PM, jianchao.wang wrote:
> 
> 
> On 12/20/18 11:17 AM, Jens Axboe wrote:
>> On 12/19/18 5:16 PM, Bart Van Assche wrote:
>>> On Wed, 2018-12-19 at 16:27 -0700, Jens Axboe wrote:
>>>> On 12/19/18 4:24 PM, Bart Van Assche wrote:
>>>>> Hello,
>>>>>
>>>>> If I run the srp blktests in a loop then I see the below call stack appearing
>>>>> sporadically. I have not yet had the time to analyze this but I'm reporting
>>>>> this here in case someone else would already have had a look at this.
>>>>>
>>>>> Bart.
>>>>>
>>>>> ==================================================================
>>>>> BUG: KASAN: use-after-free in bt_iter+0x86/0xf0
>>>>> Read of size 8 at addr ffff88803b335240 by task fio/21412
>>>>>
>>>>> CPU: 0 PID: 21412 Comm: fio Tainted: G        W         4.20.0-rc6-dbg+ #3
>>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>>>>> Call Trace:
>>>>>  dump_stack+0x86/0xca
>>>>>  print_address_description+0x71/0x239
>>>>>  kasan_report.cold.5+0x242/0x301
>>>>>  __asan_load8+0x54/0x90
>>>>>  bt_iter+0x86/0xf0
>>>>>  blk_mq_queue_tag_busy_iter+0x373/0x5e0
>>>>>  blk_mq_in_flight+0x96/0xb0
>>>>>  part_in_flight+0x40/0x140
>>>>>  part_round_stats+0x18e/0x370
>>>>>  blk_account_io_start+0x3d7/0x670
>>>>>  blk_mq_bio_to_request+0x19c/0x3a0
>>>>>  blk_mq_make_request+0x7a9/0xcb0
>>>>>  generic_make_request+0x41d/0x960
>>>>>  submit_bio+0x9b/0x250
>>>>>  do_blockdev_direct_IO+0x435c/0x4c70
>>>>>  __blockdev_direct_IO+0x79/0x88
>>>>>  ext4_direct_IO+0x46c/0xc00
>>>>>  generic_file_direct_write+0x119/0x210
>>>>>  __generic_file_write_iter+0x11c/0x280
>>>>>  ext4_file_write_iter+0x1b8/0x6f0
>>>>>  aio_write+0x204/0x310
>>>>>  io_submit_one+0x9d3/0xe80
>>>>>  __x64_sys_io_submit+0x115/0x340
>>>>>  do_syscall_64+0x71/0x210
>>>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>>>> RIP: 0033:0x7f02cf043219
>>>>
>>>> I've seen this one before as well, it's not a new thing. As far as I can
>>>> tell, it's a false positive. There should be no possibility for a
>>>> use-after-free iterating the static tags/requests.
>>>
>>> Are you sure this is a false positive?
>>
>> No I'm not, but the few times I have seen it, I haven't been able to
>> make much sense of it. It goes back quite a bit.
>>
>>> I have not yet encountered any false
>>> positive KASAN complaints. According to the following gdb output this complaint
>>> refers to reading rq->q:
>>>
>>> (gdb) list *(bt_iter+0x86)
>>> 0xffffffff816b9346 is in bt_iter (block/blk-mq-tag.c:237).
>>> 232
>>> 233		/*
>>> 234		 * We can hit rq == NULL here, because the tagging functions
>>> 235		 * test and set the bit before assigning ->rqs[].
>>> 236		 */
>>> 237		if (rq && rq->q == hctx->queue)
>>> 238			iter_data->fn(hctx, rq, iter_data->data, reserved);
>>> 239		return true;
>>> 240	}
>>> 241
>>>
>>> From the disassembly output:
>>>
>>> 232
>>> 233		/*
>>> 234		 * We can hit rq == NULL here, because the tagging functions
>>> 235		 * test and set the bit before assigning ->rqs[].
>>> 236		 */
>>> 237		if (rq && rq->q == hctx->queue)
>>>    0xffffffff816b9339 <+121>:	test   %r12,%r12
>>>    0xffffffff816b933c <+124>:	je     0xffffffff816b935f <bt_iter+159>
>>>    0xffffffff816b933e <+126>:	mov    %r12,%rdi
>>>    0xffffffff816b9341 <+129>:	callq  0xffffffff813bd3e0 <__asan_load8>
>>>    0xffffffff816b9346 <+134>:	lea    0x138(%r13),%rdi
>>>    0xffffffff816b934d <+141>:	mov    (%r12),%r14
>>>    0xffffffff816b9351 <+145>:	callq  0xffffffff813bd3e0 <__asan_load8>
>>>    0xffffffff816b9356 <+150>:	cmp    0x138(%r13),%r14
>>>    0xffffffff816b935d <+157>:	je     0xffffffff816b936f <bt_iter+175>
>>>
>>> BTW, rq may but does not have to refer to tags->static_rqs[...]. It may also
>>> refer to hctx->fq.flush_rq.
>>
>> But even those are persistent for the lifetime of the queue... But since
>> kasan complains it belongs to a specific page, I'm guessing it's one
>> of the regular requests since those are out of a chopped up page. Which
>> means it makes even less sense.
>>
>> Is this happening while devices are being actively torn down? And
>> are you using shared tags? That's the only way I could see this
>> triggering.
>>
> 
> Or could it be caused by a stale request left in a hctx->tags->rqs[] slot?
> We don't clear it after freeing the requests.
> 
> And there could be a scenario like this:
> There used to be an io scheduler attached.
> After some workload, the io scheduler is detached.
> So there could be rqs allocated by the io scheduler left in hctx->tags->rqs.
> 
> blk_mq_get_request                        blk_mq_queue_tag_busy_iter
>   -> blk_mq_get_tag
>                                             -> bt_for_each
>                                               -> bt_iter
>                                                 -> rq = tags->rqs[]
>                                                 -> rq->q
>   -> blk_mq_rq_ctx_init
>       -> data->hctx->tags->rqs[rq->tag] = rq;
> 
> If this scenario is possible, maybe we could fix it as follows.

Ah yes, good point, I bet that's what it is. But we just had this exact
discussion in another thread, and my point there was that we should
clear these when they go away, not inline. So how about clearing
entries when the sched tags go away?

-- 
Jens Axboe
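A minimal sketch of the kind of clearing Jens is suggesting: before the
scheduler tags and the requests behind their static_rqs[] are freed, walk
the hardware context's driver tags and clear every ->rqs[] slot that still
points into the scheduler tag set, so a later blk_mq_queue_tag_busy_iter()
cannot chase a stale pointer from bt_iter(). The helper name and the
brute-force walk are assumptions for illustration only, not the actual
upstream fix; the struct blk_mq_tags fields used (nr_tags, rqs, static_rqs)
are as they exist in v4.20.

#include <linux/blk-mq.h>
#include "blk-mq-tag.h"		/* block-layer private: struct blk_mq_tags */

/*
 * Hypothetical helper (sketch only): run while tearing down the
 * scheduler tags, before their static_rqs[] requests are freed.
 * Any driver-tags ->rqs[] entry that still points at one of those
 * requests is cleared, so the busy-tag iterator no longer sees it.
 */
static void blk_mq_clear_sched_rq_refs(struct blk_mq_hw_ctx *hctx,
				       struct blk_mq_tags *sched_tags)
{
	struct blk_mq_tags *drv_tags = hctx->tags;
	unsigned int i, j;

	for (i = 0; i < drv_tags->nr_tags; i++) {
		struct request *rq = drv_tags->rqs[i];

		if (!rq)
			continue;

		/* Leftover reference to a scheduler-owned request? */
		for (j = 0; j < sched_tags->nr_tags; j++) {
			if (rq == sched_tags->static_rqs[j]) {
				drv_tags->rqs[i] = NULL;
				break;
			}
		}
	}
}

A real patch would also have to order the clearing against an iterator
that may already have loaded the stale pointer; the sketch above does not
attempt to handle that race.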