Re: v4.20-rc6: Sporadic use-after-free in bt_iter()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/19/18 5:16 PM, Bart Van Assche wrote:
> On Wed, 2018-12-19 at 16:27 -0700, Jens Axboe wrote:
>> On 12/19/18 4:24 PM, Bart Van Assche wrote:
>>> Hello,
>>>
>>> If I run the srp blktests in a loop then I see the below call stack appearing
>>> sporadically. I have not yet had the time to analyze this but I'm reporting
>>> this here in case someone else would already have had a look at this.
>>>
>>> Bart.
>>>
>>> ==================================================================
>>> BUG: KASAN: use-after-free in bt_iter+0x86/0xf0
>>> Read of size 8 at addr ffff88803b335240 by task fio/21412
>>>
>>> CPU: 0 PID: 21412 Comm: fio Tainted: G        W         4.20.0-rc6-dbg+ #3
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>>> Call Trace:
>>>  dump_stack+0x86/0xca
>>>  print_address_description+0x71/0x239
>>>  kasan_report.cold.5+0x242/0x301
>>>  __asan_load8+0x54/0x90
>>>  bt_iter+0x86/0xf0
>>>  blk_mq_queue_tag_busy_iter+0x373/0x5e0
>>>  blk_mq_in_flight+0x96/0xb0
>>>  part_in_flight+0x40/0x140
>>>  part_round_stats+0x18e/0x370
>>>  blk_account_io_start+0x3d7/0x670
>>>  blk_mq_bio_to_request+0x19c/0x3a0
>>>  blk_mq_make_request+0x7a9/0xcb0
>>>  generic_make_request+0x41d/0x960
>>>  submit_bio+0x9b/0x250
>>>  do_blockdev_direct_IO+0x435c/0x4c70
>>>  __blockdev_direct_IO+0x79/0x88
>>>  ext4_direct_IO+0x46c/0xc00
>>>  generic_file_direct_write+0x119/0x210
>>>  __generic_file_write_iter+0x11c/0x280
>>>  ext4_file_write_iter+0x1b8/0x6f0
>>>  aio_write+0x204/0x310
>>>  io_submit_one+0x9d3/0xe80
>>>  __x64_sys_io_submit+0x115/0x340
>>>  do_syscall_64+0x71/0x210
>>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>>> RIP: 0033:0x7f02cf043219
>>
>> I've seen this one before as well, it's not a new thing. As far as I can
>> tell, it's a false positive. There should be no possibility for a
>> use-after-free iterating the static tags/requests.
> 
> Are you sure this is a false positive?

No I'm not, but the few times I have seen it, I haven't been able to
make much sense of it. It goes back quite a bit.

I have not yet encountered any false
> positive KASAN complaints. According to the following gdb output this complaint
> refers to reading rq->q:
> 
> (gdb) list *(bt_iter+0x86)
> 0xffffffff816b9346 is in bt_iter (block/blk-mq-tag.c:237).
> 232
> 233             /*
> 234              * We can hit rq == NULL here, because the tagging functions
> 235              * test and set the bit before assigning ->rqs[].
> 236              */
> 237             if (rq && rq->q == hctx->queue)
> 238                     iter_data->fn(hctx, rq, iter_data->data, reserved);
> 239             return true;
> 240     }
> 241
> 
> From the disassembly output:
> 
> 232
> 233             /*
> 234              * We can hit rq == NULL here, because the tagging functions
> 235              * test and set the bit before assigning ->rqs[].
> 236              */
> 237             if (rq && rq->q == hctx->queue)
>    0xffffffff816b9339 <+121>:   test   %r12,%r12
>    0xffffffff816b933c <+124>:   je     0xffffffff816b935f <bt_iter+159>
>    0xffffffff816b933e <+126>:   mov    %r12,%rdi
>    0xffffffff816b9341 <+129>:   callq  0xffffffff813bd3e0 <__asan_load8>
>    0xffffffff816b9346 <+134>:   lea    0x138(%r13),%rdi
>    0xffffffff816b934d <+141>:   mov    (%r12),%r14
>    0xffffffff816b9351 <+145>:   callq  0xffffffff813bd3e0 <__asan_load8>
>    0xffffffff816b9356 <+150>:   cmp    0x138(%r13),%r14
>    0xffffffff816b935d <+157>:   je     0xffffffff816b936f <bt_iter+175>
> 
> BTW, rq may but does not have to refer to tags->static_rqs[...]. It may also
> refer to hctx->fq.flush_rq.

But even those are persistent for the lifetime of the queue... But since
kasan complains it belongs to a specific page, I'm guessing it's one
of the regular requests since those are out of a chopped up page. Which
means it makes even less sense.

Is this happening while devices are being actively torn down? And
are you using shared tags? That's the only way I could see this
triggering.

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux