Re: [PATCH V6 7/8] io_uring/uring_cmd: support provide group kernel buffer

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Oct 09, 2024 at 04:14:33PM +0100, Pavel Begunkov wrote:
> On 10/6/24 09:46, Ming Lei wrote:
> > On Fri, Oct 04, 2024 at 04:44:54PM +0100, Pavel Begunkov wrote:
> > > On 9/12/24 11:49, Ming Lei wrote:
> > > > Allow uring command to be group leader for providing kernel buffer,
> > > > and this way can support generic device zero copy over device buffer.
> > > > 
> > > > The following patch will use the way to support zero copy for ublk.
> > > > 
> > > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > > > ---
> > > >    include/linux/io_uring/cmd.h  |  7 +++++++
> > > >    include/uapi/linux/io_uring.h |  7 ++++++-
> > > >    io_uring/uring_cmd.c          | 28 ++++++++++++++++++++++++++++
> > > >    3 files changed, 41 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/include/linux/io_uring/cmd.h b/include/linux/io_uring/cmd.h
> > > > index 447fbfd32215..fde3a2ec7d9a 100644
> > > > --- a/include/linux/io_uring/cmd.h
> > > > +++ b/include/linux/io_uring/cmd.h
> > > > @@ -48,6 +48,8 @@ void __io_uring_cmd_do_in_task(struct io_uring_cmd *ioucmd,
> > > >    void io_uring_cmd_mark_cancelable(struct io_uring_cmd *cmd,
> > > >    		unsigned int issue_flags);
> > > > +int io_uring_cmd_provide_kbuf(struct io_uring_cmd *ioucmd,
> > > > +		const struct io_uring_kernel_buf *grp_kbuf);
> > > >    #else
> > > >    static inline int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw,
> > > >    			      struct iov_iter *iter, void *ioucmd)
> > > > @@ -67,6 +69,11 @@ static inline void io_uring_cmd_mark_cancelable(struct io_uring_cmd *cmd,
> > > >    		unsigned int issue_flags)
> > > >    {
> > > >    }
> > > > +static inline int io_uring_cmd_provide_kbuf(struct io_uring_cmd *ioucmd,
> > > > +		const struct io_uring_kernel_buf *grp_kbuf)
> > > > +{
> > > > +	return -EOPNOTSUPP;
> > > > +}
> > > >    #endif
> > > >    /*
> > > > diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> > > > index 2af32745ebd3..11985eeac10e 100644
> > > > --- a/include/uapi/linux/io_uring.h
> > > > +++ b/include/uapi/linux/io_uring.h
> > > > @@ -271,9 +271,14 @@ enum io_uring_op {
> > > >     * sqe->uring_cmd_flags		top 8bits aren't available for userspace
> > > >     * IORING_URING_CMD_FIXED	use registered buffer; pass this flag
> > > >     *				along with setting sqe->buf_index.
> > > > + * IORING_PROVIDE_GROUP_KBUF	this command provides group kernel buffer
> > > > + *				for member requests which can retrieve
> > > > + *				any sub-buffer with offset(sqe->addr) and
> > > > + *				len(sqe->len)
> > > 
> > > Is there a good reason it needs to be a cmd generic flag instead of
> > > ublk specific?
> > 
> > io_uring request isn't visible for drivers, so driver can't know if the
> > uring command is one group leader.
> 
> btw, does it have to be in a group at all? Sure, nobody would be
> able to consume the buffer, but otherwise should be fine.
> 
> > Another way is to add new API of io_uring_cmd_may_provide_buffer(ioucmd)
> 
> The checks can be done inside of io_uring_cmd_provide_kbuf()

Yeah.

Now the difference is just that:

- user may know it explicitly(UAPI flag) or implicitly(driver's ->cmd_op),
- if driver knows this uring_cmd is one group leader

I am fine with either way.

> 
> > so driver can check if device buffer can be provided with this uring_cmd,
> > but I prefer to the new uring_cmd flag:
> > 
> > - IORING_PROVIDE_GROUP_KBUF can provide device buffer in generic way.
> 
> Ok, could be.
> 
> > - ->prep() can fail fast in case that it isn't one group request
> 
> I don't believe that matters, a behaving user should never
> see that kind of failure.
> 
> 
> > > 1. Extra overhead for files / cmds that don't even care about the
> > > feature.
> > 
> > It is just checking ioucmd->flags in ->prep(), and basically zero cost.
> 
> It's not if we add extra code for each every feature, at
> which point it becomes a maze of such "ifs".

Yeah, I guess it can't be avoided in current uring_cmd design, which
serves for different subsystems now, and more in future.

And the situation is similar with ioctl.

> 
> > > 2. As it stands with this patch, the flag is ignored by all other
> > > cmd implementations, which might be quite confusing as an api,
> > > especially so since if we don't set that REQ_F_GROUP_KBUF memeber
> > > requests will silently try to import a buffer the "normal way",
> > 
> > The usage is same with buffer select or fixed buffer, and consumer
> > has to check the flag.
> 
> We fails requests when it's asked to use the feature but
> those are not supported, at least non-cmd requests.
> 
> > And same with IORING_URING_CMD_FIXED which is ignored by other
> > implementations except for nvme, :-)
> 
> Oh, that's bad. If you'd try to implement the flag in the
> future it might break the uapi. It might be worth to patch it
> up on the ublk side, i.e. reject the flag, + backport, and hope
> nobody tried to use them together, hmm?
> 
> > I can understand the concern, but it exits since uring cmd is born.
> > 
> > > i.e. interpret sqe->addr or such as the target buffer.
> > 
> > > 3. We can't even put some nice semantics on top since it's
> > > still cmd specific and not generic to all other io_uring
> > > requests.
> > > 
> > > I'd even think that it'd make sense to implement it as a
> > > new cmd opcode, but that's the business of the file implementing
> > > it, i.e. ublk.
> > > 
> > > >     */
> > > >    #define IORING_URING_CMD_FIXED	(1U << 0)
> > > > -#define IORING_URING_CMD_MASK	IORING_URING_CMD_FIXED
> > > > +#define IORING_PROVIDE_GROUP_KBUF	(1U << 1)
> > > > +#define IORING_URING_CMD_MASK	(IORING_URING_CMD_FIXED | IORING_PROVIDE_GROUP_KBUF)
> > 
> > It needs one new file operation, and we shouldn't work toward
> > this way.
> 
> Not a new io_uring request, I rather meant sqe->cmd_op,
> like UBLK_U_IO_FETCH_REQ_PROVIDER_BUFFER.

`cmd_op` is supposed to be defined by subsystems, but maybe we can
reserve some for generic uring_cmd. Anyway this shouldn't be one big
deal, we can do that in future if there are more such uses.


Thanks,
Ming





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux