Re: [PATCH v2] block: I/O error occurs during SATA disk stress test

On 8/25/22 00:09, Gu Mi wrote:
> The problem is a race between two asynchronous paths. One is a new I/O
> calling blk_mq_start_request() to start the request; the other is the
> block layer timer path calling blk_mq_req_expired() to check whether
> an I/O has timed out.
> 
> When blk_add_timer() and WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT) in
> blk_mq_start_request() execute out of order, the block timer can check
> the new I/O at a point where rq->state has already been set to
> MQ_RQ_IN_FLIGHT but rq->deadline is still 0, so the new I/O is
> misjudged as timed out.
> 
> Our fix is to treat a deadline of 0 as "no timeout has occurred".
> Because jiffies on a 32-bit system wraps shortly after boot, we also
> add 1 jiffy to the deadline in that case.
> 
> Signed-off-by: Gu Mi <gumi@xxxxxxxxxxxxxxxxx>
> ---
> v1->v2:
> 
> time_after_eq() can handle the overflow, so remove the change on 
> 32-bit in blk_add_timer().
> 
>   block/blk-mq.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4b90d2d..6defaa1 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1451,6 +1451,8 @@ static bool blk_mq_req_expired(struct request *rq, unsigned long *next)
>   		return false;
>   
>   	deadline = READ_ONCE(rq->deadline);
> +	if (unlikely(deadline == 0))
> +		return false;
>   	if (time_after_eq(jiffies, deadline))
>   		return true;
>   

rq->deadline == 0 can be a valid deadline value so the above patch
doesn't look right to me.

Thanks,

Bart.
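
To illustrate the objection above, here is a minimal user-space sketch, not kernel code. It assumes a 32-bit jiffies counter and models time_after_eq() as a signed difference; model_time_after_eq(), model_deadline(), model_expired_v2() and model_zero_deadline_check() are hypothetical names introduced only for this illustration. It shows that "now + timeout" can wrap to exactly 0, in which case the patched check would never report that request as expired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of time_after_eq(a, b) for a 32-bit jiffies counter: true when
 * a is at or after b, even across a wrap. The cast to int32_t relies on
 * two's-complement conversion, as the kernel macro does with long. */
bool model_time_after_eq(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/* Hypothetical stand-in for the deadline computation: now + timeout,
 * wrapping modulo 2^32 (any rounding done by the real code is ignored). */
uint32_t model_deadline(uint32_t now, uint32_t timeout)
{
	return now + timeout;
}

/* Model of the expiry check with the proposed "deadline == 0" test. */
bool model_expired_v2(uint32_t now, uint32_t deadline)
{
	if (deadline == 0)
		return false;
	return model_time_after_eq(now, deadline);
}

void model_zero_deadline_check(void)
{
	/* A request started 0x100 jiffies before the wrap gets deadline 0. */
	uint32_t deadline = model_deadline(0xFFFFFF00u, 0x100u);

	assert(deadline == 0);
	/* Well past the deadline, the plain comparison reports expiry... */
	assert(model_time_after_eq(0x200u, deadline));
	/* ...but the check with the extra test never does. */
	assert(!model_expired_v2(0x200u, deadline));
}
```

Calling model_zero_deadline_check() from a test program exercises the case: the request with a legitimately computed deadline of 0 is treated as "not started yet" forever.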

---
In v1 of the patch, I made another modification in blk_add_timer(), as follows:
> +#ifndef CONFIG_64BIT
> +/* In case INITIAL_JIFFIES wraps on 32-bit */
> +	expiry |= 1UL;

The purpose of this modification is to guarantee that a computed deadline is never 0, so that rq->deadline == 0 in blk_mq_req_expired() can only mean it still holds its initial value of 0.
In that case, rq->deadline is known to be an invalid value in blk_mq_req_expired().
Please review my workaround again.

Thanks,

Gu Mi.
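
The effect of the "expiry |= 1UL" workaround can be sketched in user space as follows. This is only a model under stated assumptions: a 32-bit counter, with model_deadline_nonzero() and model_nonzero_check() as hypothetical names that do not exist in the kernel. The quoted hunk is guarded by #ifndef CONFIG_64BIT, so the sketch covers the 32-bit case only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the v1 change: compute now + timeout with
 * 32-bit wraparound, then force the low bit so the result is never 0. */
uint32_t model_deadline_nonzero(uint32_t now, uint32_t timeout)
{
	uint32_t expiry = now + timeout; /* wraps modulo 2^32 */

	expiry |= 1u; /* mirrors "expiry |= 1UL" from the quoted hunk */
	return expiry;
}

void model_nonzero_check(void)
{
	/* The sum wraps to 0, but the stored deadline becomes 1. */
	assert(model_deadline_nonzero(0xFFFFFF00u, 0x100u) == 1);
	/* Even sums are pushed one jiffy later; odd sums are unchanged. */
	assert(model_deadline_nonzero(10u, 20u) == 31);
	assert(model_deadline_nonzero(10u, 21u) == 31);
}
```

With the low bit always set, 0 is left free to act as the "not yet armed" marker, at the cost of a deadline that can be one jiffy later than requested.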



