Re: [RFC PATCH] blk-mq: Fix lost request during timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 19, 2017 at 11:07 PM, Keith Busch <keith.busch@xxxxxxxxx> wrote:
> On Tue, Sep 19, 2017 at 12:16:31PM +0800, Ming Lei wrote:
>> On Tue, Sep 19, 2017 at 7:08 AM, Keith Busch <keith.busch@xxxxxxxxx> wrote:
>> >
>> > Indeed that prevents .complete from running concurrently with the
>> > timeout handler, but scsi_mq_done and nvme_handle_cqe are not .complete
>> > callbacks. These are the LLD functions that call blk_mq_complete_request
>> > well before .complete. If the driver calls blk_mq_complete_request on
>> > a request that blk-mq is timing out, the request is lost because blk-mq
>> > already called blk_mark_rq_complete. Nothing prevents these LLD functions
>>
>> That shouldn't happen because only one blk_mark_rq_complete() can win
>> and it is the winner's responsibility to complete the request, so
>> there shouldn't
>> be request lost. Especially in your case, it is the responsibility of timeout
>> to complete the rq really.
>
> Hm, either I'm explaining this poorly, or I'm missing something that's
> obvious to everyone else.
>
> The driver's IRQ handler has no idea it's racing with the blk-mq timeout
> handler, and there's nothing indicating it lost the race. The IRQ handler
> just calls blk_mq_complete_request. As far as the driver is concerned,
> it has done its part to complete the request at that point.

Both blk_mark_rq_complete() and blk_mq_check_expired() calls
blk_mark_rq_complete() to try to complete the req, but only
one of them can do that actually, right?

>
> The problem is when blk-mq's timeout handler prevents the request from
> completing, and doesn't leave any indication the driver requested to
> complete it. Who is responsible for completing that request now?

Who sets ATOM_COMPLETE successfully is responsible for completing
the request. In this case it should be timeout handler, and irq handler
shouldn't touch the request any more, otherwise use-after-free may
happen.


-- 
Ming Lei



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux