Re: [PATCH 2/2] scsi: set timed out out mq requests to complete

On Fri, 2018-07-20 at 10:23 -0600, Keith Busch wrote:
> On Fri, Jul 20, 2018 at 04:20:01PM +0000, Bart Van Assche wrote:
> > On Fri, 2018-07-20 at 10:12 -0600, Keith Busch wrote:
> > > On Fri, Jul 20, 2018 at 04:03:18PM +0000, Bart Van Assche wrote:
> > > > On Fri, 2018-07-20 at 09:56 -0600, Keith Busch wrote:
> > > > > SCSI is the only block driver that wants this behavior. Moving it back
> > > > > to generic where it used to be breaks other block drivers.
> > > > 
> > > > That's new to me. What would break in the NVMe driver if the above change would be
> > > > present in the block layer?
> > > 
> > > This is what causes the block layer to lose completions, and most drivers
> > > don't want the kernel to lose their completions.
> > 
> > Hello Keith,
> > 
> > Have you considered to introduce a fourth state for block layer requests to
> > avoid that completions that occur while a timeout handler is in progress get
> > lost? That would avoid that completions get lost not only for the NVMe driver
> > but also for SCSI drivers. See e.g. the MQ_RQ_TIMED_OUT state in
> > https://www.mail-archive.com/linux-block@vger.kernel.org/msg22196.html
> 
> Yes, I've considered that, and I really want to use it, but scsi may
> still reference a freed request in scmd_eh_abort_handler that way.

I think that's a misunderstanding. If scsi_times_out() queues an abort
asynchronously then it tells the block layer through its return value that the
SCSI core still owns the request and hence that the block layer should ignore any
completions that occur until the SCSI core calls scsi_finish_command(). That
scsi_finish_command() call will trigger a call to __blk_mq_end_request(). The
scsi_times_out() return value I was referring to is called BLK_EH_DONE today and
was called BLK_EH_NOT_HANDLED in kernel version v4.17.

This also means that I got the BLK_EH_NOT_HANDLED case wrong in "blk-mq: Rework
blk-mq timeout handling again": in that case a concurrent blk_mq_complete_request()
call should be ignored instead of triggering request completion.
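
To make the ownership rule above concrete, here is a rough user-space model
in C. It is not kernel code: every model_* name, the state enum and the
ignored_completions counter are made up for this sketch. Only the idea that
the timeout handler's return value (modelled on BLK_EH_DONE and
BLK_EH_RESET_TIMER) decides who owns the request is taken from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request states; the fourth one marks "timeout handler owns it". */
enum model_rq_state {
    MODEL_RQ_IDLE,
    MODEL_RQ_IN_FLIGHT,
    MODEL_RQ_TIMED_OUT,
    MODEL_RQ_COMPLETE,
};

/* Hypothetical stand-ins for BLK_EH_DONE and BLK_EH_RESET_TIMER. */
enum model_eh_ret { MODEL_EH_DONE, MODEL_EH_RESET_TIMER };

struct model_request {
    enum model_rq_state state;
    int ignored_completions;   /* completions dropped while the driver owned the request */
};

typedef enum model_eh_ret (*model_timeout_fn)(struct model_request *rq);

static void model_start(struct model_request *rq)
{
    rq->state = MODEL_RQ_IN_FLIGHT;
    rq->ignored_completions = 0;
}

/* A completion only takes effect while the request is in flight. */
static bool model_complete(struct model_request *rq)
{
    if (rq->state != MODEL_RQ_IN_FLIGHT) {
        rq->ignored_completions++;
        return false;
    }
    rq->state = MODEL_RQ_COMPLETE;
    return true;
}

/*
 * The timeout path claims the request before calling the driver's handler.
 * MODEL_EH_DONE means the driver keeps ownership; MODEL_EH_RESET_TIMER
 * hands the request back to the in-flight state.
 */
static bool model_timeout(struct model_request *rq, model_timeout_fn handler)
{
    if (rq->state != MODEL_RQ_IN_FLIGHT)
        return false;
    rq->state = MODEL_RQ_TIMED_OUT;
    if (handler(rq) == MODEL_EH_RESET_TIMER)
        rq->state = MODEL_RQ_IN_FLIGHT;
    return true;
}

/* Models the driver ending the request itself once error handling is done. */
static bool model_finish(struct model_request *rq)
{
    if (rq->state != MODEL_RQ_TIMED_OUT)
        return false;
    rq->state = MODEL_RQ_COMPLETE;
    return true;
}

/* Models a handler that queued an asynchronous abort and keeps the request. */
static enum model_eh_ret model_abort_queued(struct model_request *rq)
{
    (void)rq;
    return MODEL_EH_DONE;
}

/* Scenario from the mail: a completion races with error handling and is ignored. */
static int model_demo_late_completion(void)
{
    struct model_request rq;

    model_start(&rq);
    assert(model_timeout(&rq, model_abort_queued));
    assert(!model_complete(&rq));   /* late completion: dropped */
    assert(model_finish(&rq));      /* driver ends the request exactly once */
    return rq.ignored_completions;
}
```

In this model the request is ended exactly once, by model_finish(), and the
late completion only bumps the counter. A real implementation would also
have to make the state transitions atomic, which the sketch leaves out.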

Bart.


