On 1/25/23 05:32, Bart Van Assche wrote:
> On 1/24/23 11:59, Keith Busch wrote:
>> On Tue, Jan 24, 2023 at 11:29:10AM -0800, Bart Van Assche wrote:
>>> On 1/24/23 11:02, Niklas Cassel wrote:
>>>> Introduce the new block IO status BLK_STS_DURATION_LIMIT for LLDDs to
>>>> report command that failed due to a command duration limit being
>>>> exceeded. This new status is mapped to the ETIME error code to allow
>>>> users to differentiate "soft" duration limit failures from other more
>>>> serious hardware related errors.
>>>
>>> What makes exceeding the duration limit different from an I/O timeout
>>> (BLK_STS_TIMEOUT)? Why is it important to tell the difference between an I/O
>>> timeout and exceeding the command duration limit?
>>
>> BLK_STS_TIMEOUT should be used if the target device doesn't provide any
>> response to the command. The DURATION_LIMIT status is used when the device
>> completes a command with that status.
>
> Hi Keith,
>
> From SPC-6: "The MAX ACTIVE TIME field specifies an upper limit on the
> time that elapses from the time at which the device server initiates
> actions to access, transfer, or act upon the specified data until the
> time the device server returns status for the command."
>
> My interpretation of the above text is that the SCSI command duration
> limit specifies a hard limit, the same type of limit reported by the
> status code BLK_STS_TIMEOUT. It is not clear to me from the patch
> description why a new status code is needed for reporting that the
> command duration limit has been exceeded.

As explained, this allows differentiating the "drive gave a response"
(BLK_STS_DURATION_LIMIT) from the "drive is not responding" case with
BLK_STS_TIMEOUT. We took care of mapping BLK_STS_DURATION_LIMIT to ETIME
(timer expired) for user space too, to not overload ETIMEDOUT used with
BLK_STS_TIMEOUT.

We can certainly improve the commit message to describe all of this in more
details.

>
> Thanks,
>
> Bart.

-- 
Damien Le Moal
Western Digital Research
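
For illustration only, here is a minimal user-space sketch of how an
application might use the ETIME/ETIMEDOUT split described above. It assumes the
mapping proposed in this series (BLK_STS_DURATION_LIMIT to ETIME,
BLK_STS_TIMEOUT to ETIMEDOUT). The enum and the two helper functions are
hypothetical names invented for this example; they are not part of the patch or
of any kernel or libc API.

```c
#include <errno.h>

/* Hypothetical classification of a failed I/O, for illustration only. */
enum io_failure_kind {
	IO_FAIL_DURATION_LIMIT,	/* device responded: duration limit exceeded */
	IO_FAIL_TIMEOUT,	/* device did not respond in time */
	IO_FAIL_OTHER,		/* any other error */
};

/*
 * Classify the errno value of a failed read/write, assuming the mapping
 * discussed in this thread: ETIME for a command the drive completed with a
 * duration-limit status, ETIMEDOUT for a command the drive never answered.
 */
static enum io_failure_kind classify_io_errno(int err)
{
	switch (err) {
	case ETIME:
		return IO_FAIL_DURATION_LIMIT;
	case ETIMEDOUT:
		return IO_FAIL_TIMEOUT;
	default:
		return IO_FAIL_OTHER;
	}
}

/* Return a short description of what the failure implies for the caller. */
static const char *io_failure_hint(enum io_failure_kind kind)
{
	switch (kind) {
	case IO_FAIL_DURATION_LIMIT:
		return "soft failure: drive responded, limit exceeded";
	case IO_FAIL_TIMEOUT:
		return "hard failure: drive not responding";
	default:
		return "other error";
	}
}
```

A caller would pass errno after a failed read() or write(), for example
io_failure_hint(classify_io_errno(errno)). How an application should react to
each case (retry, fall back to another replica, report the error) is a policy
choice that this thread does not settle.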