Re: [PATCH 0/6] ublk zero-copy support

On Fri, Feb 07, 2025 at 11:51:49AM +0800, Ming Lei wrote:
> On Mon, Feb 03, 2025 at 07:45:11AM -0800, Keith Busch wrote:
> > 
> > The previous version from Ming can be viewed here:
> > 
> >   https://lore.kernel.org/linux-block/20241107110149.890530-1-ming.lei@xxxxxxxxxx/
> > 
> > Based on the feedback from that thread, the desired io_uring interfaces
> > needed to be simpler, and the kernel registered resources need to behave
> > more similarly to user registered buffers.
> > 
> > This series introduces a new resource node type, KBUF, which, like the
> > BUFFER resource, needs to be installed into an io_uring buf_node table
> > in order for the user to access it in a fixed buffer command. The
> > new io_uring kernel API provides a way for a user to register a struct
> > request's bvec to a specific index, and a way to unregister it.
> > 
> > When the ublk server receives notification of a new command, it must
> > first select an index and register the zero copy buffer. It may use that
> > index for any number of fixed buffer commands, then it must unregister
> > the index when it's done. This can all be done in a single io_uring_enter
> > if desired, or it can be split into multiple enters if needed.
> 
> I suspect it may not be possible in a single io_uring_enter() because
> there is a strict dependency among the three OPs (register buffer,
> read/write, unregister buffer).

The registration is synchronous. io_uring completes the SQE entirely
before it even looks at the read command in the next SQE.

The read or write is asynchronous, but its prep takes a reference on
the node before moving on to the next SQE.

The unregister is synchronous, and clears the index node, but the
possibly inflight read or write has a reference on that node, so all
good.
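
To make the ordering concrete, here is a toy userspace model of the
lifetime rules described above. It is only a sketch and not the kernel
code: every name in it (toy_node, toy_register, toy_prep_fixed,
toy_unregister, toy_complete) is made up for illustration, and it
assumes a plain integer refcount with no locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a registered buffer node; not a kernel type. */
struct toy_node {
	int refs;	/* one for the table slot, plus one per in-flight request */
	int *freed;	/* set to 1 when the last reference is dropped */
};

#define TOY_TABLE_SIZE 4
static struct toy_node *toy_table[TOY_TABLE_SIZE];

/* Drop one reference; free the node when none remain. */
static void toy_put(struct toy_node *node)
{
	if (--node->refs == 0) {
		*node->freed = 1;
		free(node);
	}
}

/* Synchronous register: install a node at idx, the table holds one ref. */
static int toy_register(unsigned int idx, int *freed)
{
	struct toy_node *node;

	if (idx >= TOY_TABLE_SIZE || toy_table[idx])
		return -1;
	node = malloc(sizeof(*node));
	if (!node)
		return -1;
	node->refs = 1;
	node->freed = freed;
	*freed = 0;
	toy_table[idx] = node;
	return 0;
}

/* Prep of a fixed-buffer read/write: look up idx and take a reference. */
static struct toy_node *toy_prep_fixed(unsigned int idx)
{
	if (idx >= TOY_TABLE_SIZE || !toy_table[idx])
		return NULL;
	toy_table[idx]->refs++;
	return toy_table[idx];
}

/* Synchronous unregister: clear the slot and drop the table's reference. */
static int toy_unregister(unsigned int idx)
{
	struct toy_node *node;

	if (idx >= TOY_TABLE_SIZE || !toy_table[idx])
		return -1;
	node = toy_table[idx];
	toy_table[idx] = NULL;
	toy_put(node);
	return 0;
}

/* Asynchronous completion: the request drops the reference it took in prep. */
static void toy_complete(struct toy_node *node)
{
	toy_put(node);
}
```

In this model, calling toy_register(), toy_prep_fixed() and
toy_unregister() back to back leaves the node alive until
toy_complete() runs, which is the property that lets all three SQEs go
in one submission.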

> > +		ublk_get_sqe_three(q->ring_ptr, &reg, &read, &ureg);
> > +
> > +		io_uring_prep_buf_register(reg, 0, tag, q->q_id, tag);
> > +
> > +		io_uring_prep_read_fixed(read, 1 /*fds[1]*/,
> > +			0,
> > +			iod->nr_sectors << 9,
> > +			iod->start_sector << 9,
> > +			tag);
> > +		io_uring_sqe_set_flags(read, IOSQE_FIXED_FILE);
> > +		read->user_data = build_user_data(tag, ublk_op, 0, 1);
> 
> Does this interface support reading into part of the buffer? That is
> useful for stacking device cases.

Are you wanting to read into this buffer without copying in parts? As in
provide an offset and/or smaller length across multiple commands? If
that's what you mean, then yes, you can do that here.
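
As a rough illustration of the "offset and/or smaller length" case, the
helper below splits one request's byte range into chunks, each of which
could back its own fixed buffer command. It is a sketch only: toy_chunk
and toy_split are hypothetical names, and it assumes the buffer address
of a fixed buffer command is treated as a byte offset into the
registered kernel buffer, as the 0 in the quoted example suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one sub-range of a registered buffer. */
struct toy_chunk {
	size_t buf_off;	/* byte offset into the registered buffer (assumed) */
	size_t len;	/* bytes covered by this command */
};

/*
 * Split `total` bytes into pieces of at most `chunk` bytes.
 * Writes up to `max` entries into out[] and returns how many were written;
 * returns 0 if `chunk` is 0.
 */
static size_t toy_split(size_t total, size_t chunk,
			struct toy_chunk *out, size_t max)
{
	size_t n = 0, off = 0;

	if (chunk == 0)
		return 0;
	while (off < total && n < max) {
		size_t len = total - off < chunk ? total - off : chunk;

		out[n].buf_off = off;
		out[n].len = len;
		off += len;
		n++;
	}
	return n;
}
```

For a 10-sector request (10 << 9 bytes) and a 4096-byte chunk size, this
yields one 4096-byte piece at offset 0 and one 1024-byte piece at
offset 4096, and every piece would use the same buffer index.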
 
> Also, does this interface support consuming the buffer from multiple
> OPs concurrently?

You can register as many kernel buffers from as many OPs as you have
space for in your table, and you can use them all concurrently. Pretty
much the same as user registered fixed buffers. The main difference from
user buffers is how you register them.



