Re: [PATCH] block, bfq: NULL out the bic when it's no longer valid

Paolo Valente <paolo.valente@xxxxxxxxxx> · Fri, 28 Jun 2019 09:51:28 +0200

> Il giorno 28 giu 2019, alle ore 06:57, Guenter Roeck <linux@xxxxxxxxxxxx> ha scritto:
> 
> On Thu, Jun 27, 2019 at 09:44:09PM -0700, Douglas Anderson wrote:
>> In reboot tests on several devices we were seeing a "use after free"
>> when slub_debug or KASAN was enabled.  The kernel complained about:
>> 
>>  Unable to handle kernel paging request at virtual address 6b6b6c2b
>> 
>> ...which is a classic sign of use after free under slub_debug. The
>> stack crawl in kgdb looked like:
>> 
>> 0  test_bit (addr=<optimized out>, nr=<optimized out>)
>> 1  bfq_bfqq_busy (bfqq=<optimized out>)
>> 2  bfq_select_queue (bfqd=<optimized out>)
>> 3  __bfq_dispatch_request (hctx=<optimized out>)
>> 4  bfq_dispatch_request (hctx=<optimized out>)
>> 5  0xc056ef00 in blk_mq_do_dispatch_sched (hctx=0xed249440)
>> 6  0xc056f728 in blk_mq_sched_dispatch_requests (hctx=0xed249440)
>> 7  0xc0568d24 in __blk_mq_run_hw_queue (hctx=0xed249440)
>> 8  0xc0568d94 in blk_mq_run_work_fn (work=<optimized out>)
>> 9  0xc024c5c4 in process_one_work (worker=0xec6d4640, work=0xed249480)
>> 10 0xc024cff4 in worker_thread (__worker=0xec6d4640)
>> 
>> Digging in kgdb, it could be found that, though bfqq looked fine,
>> bfqq->bic had been freed.
>> 
>> Through further digging, I postulated that perhaps it is illegal to
>> access a "bic" (AKA an "icq") after bfq_exit_icq() had been called
>> because the "bic" can be freed at some point in time after this call
>> is made.  I confirmed that there certainly were cases where the exact
>> crashing code path would access the "bic" after bfq_exit_icq() had
>> been called.  Sspecifically I set the "bfqq->bic" to (void *)0x7 and
>                ^^^
> 
>> saw that the bic was 0x7 at the time of the crash.
>> 
>> To understand a bit more about why this crash was fairly uncommon (I
>> saw it only once in a few hundred reboots), you can see that much of
>> the time bfq_exit_icq_fbqq() fully frees the bfqq and thus it can't
>> access the ->bic anymore.  The only case it doesn't is if
>> bfq_put_queue() sees a reference still held.
>> 
>> However, even in the case when bfqq isn't freed, the crash is still
>> rare.  Why?  I tracked what happened to the "bic" after the exit
>> routine.  It doesn't get freed right away.  Rather,
>> put_io_context_active() eventually called put_io_context() which
>> queued up freeing on a workqueue.  The freeing then actually happened
>> later than that through call_rcu().  Despite all these delays, some
>> extra debugging showed that all the hoops could be jumped through in
>> time and the memory could be freed causing the original crash. Phew!
>> 
>> To make a long story short, assuming it truly is illegal to access an
>> icq after the "exit_icq" callback is finished, this patch is needed.
>> 

Nice work! I think this bug has been there since when BFQ was born ...

Reviewed-by: Paolo Valente <paolo.valente@xxxxxxxxxx>

Thanks,
Paolo

>> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
> 
> Nicely done ... thanks!
> 
> Reviewed-by: Guenter Roeck <groeck@xxxxxxxxxxxx>
> 
>> ---
>> Most of the testing of this was done on the Chrome OS 4.19 kernel with
>> BFQ backported (thanks to Paolo's help).  I did manage to reproduce a
>> crash on mainline Linux (v5.2-rc6) though.
>> 
>> To see some of the techniques used to debug this, see
>> <https://crrev.com/c/1679134> and <https://crrev.com/c/1681258/1>.
>> 
>> I'll also note that on linuxnext (next-20190627) I saw some other
>> use-after-frees that seemed related to BFQ but haven't had time to
>> debug.  They seemed unrelated.
>> 
>> block/bfq-iosched.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index f8d430f88d25..6c0cff03f8f6 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -4584,6 +4584,7 @@ static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
>> 		unsigned long flags;
>> 
>> 		spin_lock_irqsave(&bfqd->lock, flags);
>> +		bfqq->bic = NULL;
>> 		bfq_exit_bfqq(bfqd, bfqq);
>> 		bic_set_bfqq(bic, NULL, is_sync);
>> 		spin_unlock_irqrestore(&bfqd->lock, flags);
>> -- 
>> 2.22.0.410.gd8fdbe21b5-goog