Re: [PATCH v2 05/14] ublk: don't hard code IO_URING_F_UNLOCKED

Ming Lei <ming.lei@xxxxxxxxxx> · Mon, 18 Mar 2024 22:39:30 +0800

On Mon, Mar 18, 2024 at 02:32:16PM +0000, Pavel Begunkov wrote:
> On 3/18/24 13:37, Pavel Begunkov wrote:
> > On 3/18/24 12:52, Pavel Begunkov wrote:
> > > On 3/18/24 08:16, Ming Lei wrote:
> > > > On Mon, Mar 18, 2024 at 12:41:50AM +0000, Pavel Begunkov wrote:
> > > > > uring_cmd implementations should not try to guess issue_flags, just use
> > > > > a newly added io_uring_cmd_complete(). We're loosing an optimisation in
> > > > > the cancellation path in ublk_uring_cmd_cancel_fn(), but the assumption
> > > > > is that we don't care that much about it.
> > > > > 
> > > > > Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
> > > > > Link: https://lore.kernel.org/r/2f7bc9fbc98b11412d10b8fd88e58e35614e3147.1710514702.git.asml.silence@xxxxxxxxx
> > > > > Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> > > > > ---
> > > > >   drivers/block/ublk_drv.c | 18 ++++++++----------
> > > > >   1 file changed, 8 insertions(+), 10 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> > > > > index bea3d5cf8a83..97dceecadab2 100644
> > > > > --- a/drivers/block/ublk_drv.c
> > > > > +++ b/drivers/block/ublk_drv.c
> > > > > @@ -1417,8 +1417,7 @@ static bool ublk_abort_requests(struct ublk_device *ub, struct ublk_queue *ubq)
> > > > >       return true;
> > > > >   }
> > > > > -static void ublk_cancel_cmd(struct ublk_queue *ubq, struct ublk_io *io,
> > > > > -        unsigned int issue_flags)
> > > > > +static void ublk_cancel_cmd(struct ublk_queue *ubq, struct ublk_io *io)
> > > > >   {
> > > > >       bool done;
> > > > > @@ -1432,15 +1431,14 @@ static void ublk_cancel_cmd(struct ublk_queue *ubq, struct ublk_io *io,
> > > > >       spin_unlock(&ubq->cancel_lock);
> > > > >       if (!done)
> > > > > -        io_uring_cmd_done(io->cmd, UBLK_IO_RES_ABORT, 0, issue_flags);
> > > > > +        io_uring_cmd_complete(io->cmd, UBLK_IO_RES_ABORT, 0);
> > > > >   }
> > > > >   /*
> > > > >    * The ublk char device won't be closed when calling cancel fn, so both
> > > > >    * ublk device and queue are guaranteed to be live
> > > > >    */
> > > > > -static void ublk_uring_cmd_cancel_fn(struct io_uring_cmd *cmd,
> > > > > -        unsigned int issue_flags)
> > > > > +static void ublk_uring_cmd_cancel_fn(struct io_uring_cmd *cmd)
> > > > >   {
> > > > >       struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
> > > > >       struct ublk_queue *ubq = pdu->ubq;
> > > > > @@ -1464,7 +1462,7 @@ static void ublk_uring_cmd_cancel_fn(struct io_uring_cmd *cmd,
> > > > >       io = &ubq->ios[pdu->tag];
> > > > >       WARN_ON_ONCE(io->cmd != cmd);
> > > > > -    ublk_cancel_cmd(ubq, io, issue_flags);
> > > > > +    ublk_cancel_cmd(ubq, io);
> > > > 
> > > > .cancel_fn is always called with .uring_lock held, so this 'issue_flags' can't
> > > > be removed, otherwise double task run is caused because .cancel_fn
> > > > can be called multiple times if the request stays in ctx->cancelable_uring_cmd.
> > > 
> > > I see, that's exactly why I was asking whether it can be deferred
> > > to tw. Let me see if I can get by without that patch, but honestly
> > > it's a horrible abuse of the ring state. Any ideas how that can be
> > > cleaned up?
> > 
> > I assume io_uring_try_cancel_uring_cmd() can run in parallel with
> > completions, so there can be two parallel calls calls to ->uring_cmd
> > (e.g. io-wq + cancel), which gives me shivers. Also, I'd rather
> > no cancel in place requests of another task, io_submit_flush_completions()
> > but it complicates things.
> 
> I'm wrong though on flush_completions, the task there cancels only
> its own requests
> 
> io_uring_try_cancel_uring_cmd() {
> 	...
> 	if (!cancel_all && req->task != task)
> 		continue;
> }
> 
> 
> > Is there any argument against removing requests from the cancellation
> > list in io_uring_try_cancel_uring_cmd() before calling ->uring_cmd?
> > 
> > io_uring_try_cancel_uring_cmd() {
> >      lock();
> >      for_each_req() {
> >          remove_req_from_cancel_list(req);
> >          req->file->uring_cmd();
> >      }
> >      unlock();

Also the req may not be ready to cancel in ->uring_cmd(), so it
should be allowed to retry in future if it isn't canceled this time.

Thanks,
Ming