Re: [PATCH V2] block: ignore RWF_HIPRI hint for sync dio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, May 23, 2022 at 09:02:39PM -0600, Keith Busch wrote:
> On Tue, May 24, 2022 at 08:40:46AM +0800, Ming Lei wrote:
> > On Mon, May 23, 2022 at 04:36:04PM -0600, Keith Busch wrote:
> > > On Wed, Apr 20, 2022 at 10:31:10PM +0800, Ming Lei wrote:
> > > > So far bio is marked as REQ_POLLED if RWF_HIPRI/IOCB_HIPRI is passed
> > > > from userspace sync io interface, then block layer tries to poll until
> > > > the bio is completed. But the current implementation calls
> > > > blk_io_schedule() if bio_poll() returns 0, and this way causes io hang or
> > > > timeout easily.
> > > 
> > > Wait a second. The task's current state is TASK_RUNNING when bio_poll() returns
> > > zero, so calling blk_io_schedule() isn't supposed to hang.
> > 
> > void __sched io_schedule(void)
> > {
> >         int token;
> > 
> >         token = io_schedule_prepare();
> >         schedule();
> >         io_schedule_finish(token);
> > }
> > 
> > But who can wakeup this task after scheduling out? There can't be irq
> > handler for POLLED request.
> 
> No one. If everything was working, the task state would be RUNNING, so it is
> immediately available to be scheduled back in.
>  
> > The hang can be triggered on nvme/qemu reliably:
> 
> And clearly it's not working, but for a different reason. The polling thread
> sees an invalid cookie and fails to set the task back to RUNNING, so yes, it
> will sleep with no waker in the current code.
> 
> We usually expect the cookie to be set inline with submit_bio(), but we're not
> guaranteed it won't be punted off to .run_work for a variety of reasons, so
> the thread writing the cookie may be racing with the reader.
> 
> This was fine before the bio polling support since the cookie was always
> returned with submit_bio() before that.
> 
> And I would like psync to continue working with polling. As great as io_uring
> is, it's just not as efficient @qd1.
> 
> Here's a bandaid, though I assume it'll break something...
> 
> ---
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index ed1869a305c4..348136dc7ba9 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1146,8 +1146,6 @@ void blk_mq_start_request(struct request *rq)
>  	if (blk_integrity_rq(rq) && req_op(rq) == REQ_OP_WRITE)
>  		q->integrity.profile->prepare_fn(rq);
>  #endif
> -	if (rq->bio && rq->bio->bi_opf & REQ_POLLED)
> -	        WRITE_ONCE(rq->bio->bi_cookie, blk_rq_to_qc(rq));
>  }
>  EXPORT_SYMBOL(blk_mq_start_request);
>  
> @@ -2464,6 +2462,9 @@ static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
>  	WARN_ON_ONCE(err);
>  
>  	blk_account_io_start(rq);
> +
> +	if (rq->bio->bi_opf & REQ_POLLED)
> +	        WRITE_ONCE(rq->bio->bi_cookie, blk_rq_to_qc(rq));
>  }

Yeah, this way may improve the situation, but still can't fix all,
what if the submitted bio is merged to another request in plug list?


Thanks,
Ming




[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux