Re: [PATCH 1/2] blk-mq: introduce blk_mq_complete_request_sync()

On 3/17/2019 9:09 PM, Bart Van Assche wrote:
On 3/17/19 8:29 PM, Ming Lei wrote:
NVMe's error handler follows the typical steps for tearing down
hardware:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
    blk_mq_tagset_busy_iter(tags, cancel_request, ...)
cancel_request():
    mark the request as aborted
    blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be a race between #3 and #4, because blk_mq_complete_request()
actually completes the request asynchronously.

This patch introduces blk_mq_complete_request_sync() for fixing the
above race.
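
To make the ordering problem concrete, here is a rough user-space model of steps 3 and 4. It is only a sketch: every toy_* name is hypothetical and none of this is kernel code. It assumes, as the description above says, that the asynchronous path merely schedules the completion, so the completion can still run after the queue has been destroyed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model. All toy_* names are hypothetical, not kernel APIs. */
struct toy_request {
    bool aborted;
    bool completed;
    struct toy_request *next_deferred;
};

struct toy_queue {
    bool destroyed;
    struct toy_request *deferred;   /* completions scheduled but not yet run */
    int completed_after_destroy;    /* counts completions that lost the race */
};

/* The actual completion work; records whether it ran too late. */
static void toy_finish(struct toy_queue *q, struct toy_request *rq)
{
    if (q->destroyed)
        q->completed_after_destroy++;
    rq->completed = true;
}

/* Models asynchronous completion: only schedules the work. */
static void toy_complete_async(struct toy_queue *q, struct toy_request *rq)
{
    rq->next_deferred = q->deferred;
    q->deferred = rq;
}

/* Models synchronous completion: the work is done before returning. */
static void toy_complete_sync(struct toy_queue *q, struct toy_request *rq)
{
    toy_finish(q, rq);
}

/* Runs whatever completions were scheduled earlier. */
static void toy_run_deferred(struct toy_queue *q)
{
    while (q->deferred) {
        struct toy_request *rq = q->deferred;

        q->deferred = rq->next_deferred;
        rq->next_deferred = NULL;
        toy_finish(q, rq);
    }
}

/*
 * Step 3 (cancel every in-flight request) followed by step 4 (destroy
 * the queue). Returns how many completions ran after the destroy.
 */
static int toy_teardown(struct toy_queue *q, struct toy_request *rqs,
                        size_t n, bool sync)
{
    for (size_t i = 0; i < n; i++) {
        rqs[i].aborted = true;
        if (sync)
            toy_complete_sync(q, &rqs[i]);
        else
            toy_complete_async(q, &rqs[i]);
    }
    q->destroyed = true;    /* step 4 */
    toy_run_deferred(q);    /* scheduled completions fire late */
    return q->completed_after_destroy;
}
```

In this model, toy_teardown() with sync set to false reports every cancelled request completing after the destroy, while the synchronous variant reports none. The model always runs the deferred work late, which is the worst case; in a real system the outcome would depend on timing.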

Other block drivers wait until outstanding requests have completed by calling blk_cleanup_queue() before hardware queues are destroyed. Why can't the NVMe driver follow that approach?


Speaking for the fabrics, not necessarily PCI:

The intent of this looping, which happens immediately after an error is detected, is to force termination of the outstanding requests. Otherwise, the only recourse is to wait for the I/Os to finish, which they may never do, or for their upper-level timeouts to expire and terminate them, which means a very long delay. Also, one of the commands on the admin queue (a different tag set, but handled the same way), the Async Event Reporting command, has no timeout, so it wouldn't necessarily clear without this looping.

-- james



