Re: [PATCHv3 2/3] scsi: Do not rely on blk-mq for double completions

Bart Van Assche <bvanassche@xxxxxxx> · Fri, 16 Nov 2018 06:46:40 -0800

On Fri, 2018-11-16 at 10:53 +-0100, Hannes Reinecke wrote:
+AD4 On 11/15/18 6:58 PM, Keith Busch wrote:
+AD4 +AD4 The scsi timeout error handling had been directly updating the block
+AD4 +AD4 layer's request state to prevent a error handling and a natural completion
+AD4 +AD4 from completing the same request twice. Fix this layering violation
+AD4 +AD4 by having scsi control the fate of its commands with scsi owned flags
+AD4 +AD4 rather than use blk-mq's.
+AD4 +AD4 
+AD4 +AD4 Signed-off-by: Keith Busch +ADw-keith.busch+AEA-intel.com+AD4
+AD4 +AD4 ---
+AD4 +AD4   drivers/scsi/scsi+AF8-error.c +AHw 22 +-+-+-+-+-+-+-+-+-+-+------------
+AD4 +AD4   drivers/scsi/scsi+AF8-lib.c   +AHw  6 +-+-+-+-+--
+AD4 +AD4   include/scsi/scsi+AF8-cmnd.h  +AHw  5 +-+-+-+--
+AD4 +AD4   3 files changed, 20 insertions(+-), 13 deletions(-)
+AD4 +AD4 
+AD4 +AD4 diff --git a/drivers/scsi/scsi+AF8-error.c b/drivers/scsi/scsi+AF8-error.c
+AD4 +AD4 index dd338a8cd275..e92e088f636f 100644
+AD4 +AD4 --- a/drivers/scsi/scsi+AF8-error.c
+AD4 +AD4 +-+-+- b/drivers/scsi/scsi+AF8-error.c
+AD4 +AD4 +AEAAQA -297,19 +-297,19 +AEAAQA enum blk+AF8-eh+AF8-timer+AF8-return scsi+AF8-times+AF8-out(struct request +ACo-req)
+AD4 +AD4   
+AD4 +AD4   	if (rtn +AD0APQ BLK+AF8-EH+AF8-DONE) +AHs
+AD4 +AD4   		/+ACo
+AD4 +AD4 -		 +ACo For blk-mq, we must set the request state to complete now
+AD4 +AD4 -		 +ACo before sending the request to the scsi error handler. This
+AD4 +AD4 -		 +ACo will prevent a use-after-free in the event the LLD manages
+AD4 +AD4 -		 +ACo to complete the request before the error handler finishes
+AD4 +AD4 -		 +ACo processing this timed out request.
+AD4 +AD4 +-		 +ACo Set the command to complete first in order to prevent a real
+AD4 +AD4 +-		 +ACo completion from releasing the command while error handling
+AD4 +AD4 +-		 +ACo is using it. If the command was already completed, then the
+AD4 +AD4 +-		 +ACo lower level driver beat the timeout handler, and it is safe
+AD4 +AD4 +-		 +ACo to return without escalating error recovery.
+AD4 +AD4   		 +ACo
+AD4 +AD4 -		 +ACo If the request was already completed, then the LLD beat the
+AD4 +AD4 -		 +ACo time out handler from transferring the request to the scsi
+AD4 +AD4 -		 +ACo error handler. In that case we can return immediately as no
+AD4 +AD4 -		 +ACo further action is required.
+AD4 +AD4 +-		 +ACo If timeout handling lost the race to a real completion, the
+AD4 +AD4 +-		 +ACo block layer may ignore that due to a fake timeout injection,
+AD4 +AD4 +-		 +ACo so return RESET+AF8-TIMER to allow error handling another shot
+AD4 +AD4 +-		 +ACo at this command.
+AD4 +AD4   		 +ACo-/
+AD4 +AD4 -		if (+ACE-blk+AF8-mq+AF8-mark+AF8-complete(req))
+AD4 +AD4 -			return rtn+ADs
+AD4 +AD4 +-		if (test+AF8-and+AF8-set+AF8-bit(+AF8AXw-SCMD+AF8-COMPLETE, +ACY-scmd-+AD4-flags))
+AD4 +AD4 +-			return BLK+AF8-EH+AF8-RESET+AF8-TIMER+ADs
+AD4 +AD4   		if (scsi+AF8-abort+AF8-command(scmd) +ACEAPQ SUCCESS) +AHs
+AD4 +AD4   			set+AF8-host+AF8-byte(scmd, DID+AF8-TIME+AF8-OUT)+ADs
+AD4 +AD4   			scsi+AF8-eh+AF8-scmd+AF8-add(scmd)+ADs
+AD4 +AD4 diff --git a/drivers/scsi/scsi+AF8-lib.c b/drivers/scsi/scsi+AF8-lib.c
+AD4 +AD4 index 5d83a162d03b..c1d5e4e36125 100644
+AD4 +AD4 --- a/drivers/scsi/scsi+AF8-lib.c
+AD4 +AD4 +-+-+- b/drivers/scsi/scsi+AF8-lib.c
+AD4 +AD4 +AEAAQA -1635,8 +-1635,11 +AEAAQA static blk+AF8-status+AF8-t scsi+AF8-mq+AF8-prep+AF8-fn(struct request +ACo-req)
+AD4 +AD4   
+AD4 +AD4   static void scsi+AF8-mq+AF8-done(struct scsi+AF8-cmnd +ACo-cmd)
+AD4 +AD4   +AHs
+AD4 +AD4 +-	if (unlikely(test+AF8-and+AF8-set+AF8-bit(+AF8AXw-SCMD+AF8-COMPLETE, +ACY-cmd-+AD4-flags)))
+AD4 +AD4 +-		return+ADs
+AD4 +AD4   	trace+AF8-scsi+AF8-dispatch+AF8-cmd+AF8-done(cmd)+ADs
+AD4 +AD4 -	blk+AF8-mq+AF8-complete+AF8-request(cmd-+AD4-request)+ADs
+AD4 +AD4 +-	if (unlikely(+ACE-blk+AF8-mq+AF8-complete+AF8-request(cmd-+AD4-request)))
+AD4 +AD4 +-		clear+AF8-bit(+AF8AXw-SCMD+AF8-COMPLETE, +ACY-cmd-+AD4-flags)+ADs
+AD4 +AD4   +AH0
+AD4 +AD4   
+AD4 +AD4   static void scsi+AF8-mq+AF8-put+AF8-budget(struct blk+AF8-mq+AF8-hw+AF8-ctx +ACo-hctx)
+AD4 +AD4 +AEAAQA -1701,6 +-1704,7 +AEAAQA static blk+AF8-status+AF8-t scsi+AF8-queue+AF8-rq(struct blk+AF8-mq+AF8-hw+AF8-ctx +ACo-hctx,
+AD4 +AD4   			goto out+AF8-dec+AF8-host+AF8-busy+ADs
+AD4 +AD4   		req-+AD4-rq+AF8-flags +AHwAPQ RQF+AF8-DONTPREP+ADs
+AD4 +AD4   	+AH0 else +AHs
+AD4 +AD4 +-		cmd-+AD4-flags +ACYAPQ +AH4-SCMD+AF8-COMPLETE+ADs
+AD4 +AD4   		blk+AF8-mq+AF8-start+AF8-request(req)+ADs
+AD4 +AD4   	+AH0
+AD4 +AD4   
+AD4 
+AD4 Why do you use a normal assignment here (and not a clear+AF8-bit() as in the 
+AD4 above hunk)?
+AD4 
+AD4 And shouldn't you add a barrier here to broadcast the state change to 
+AD4 other cores?

Although I don't like it that request state information is being moved from 
the block layer core into the SCSI core, I think that this patch is fine.
scsi+AF8-queue+AF8-rq() is only called after a request structure has been recycled
so I don't think that there can be any kind of race between the code paths
that use atomic instructions to manipulate the +AF8AXw-SCMD+AF8-COMPLETE bit and
scsi+AF8-queue+AF8-rq().

Bart.