Re: [PATCH V8 0/8] io_uring: support sqe group and leased group kbuf

Jens Axboe <axboe@xxxxxxxxx> · Wed, 30 Oct 2024 07:20:48 -0600

On 10/29/24 10:11 PM, Ming Lei wrote:
> On Wed, Oct 30, 2024 at 11:08:16AM +0800, Ming Lei wrote:
>> On Tue, Oct 29, 2024 at 08:43:39PM -0600, Jens Axboe wrote:
> 
> ...
> 
>>> You could avoid the OP dependency with just a flag, if you really wanted
>>> to. But I'm not sure it makes a lot of sense. And it's a hell of a lot
>>
>> Yes, IO_LINK won't work for submitting multiple IOs concurrently; an extra
>> syscall makes the application too complicated and increases IO latency.
>>
>>> simpler than the sqe group scheme, which I'm a bit worried about as it's
>>> a bit complicated in how deep it needs to go in the code. This one
>>> stands alone, so I'd strongly encourage we pursue this a bit further and
>>> iron out the kinks. Maybe it won't work in the end, I don't know, but it
>>> seems pretty promising and it's soooo much simpler.
>>
>> If buffer registration and lookup are always done in ->prep(), the OP
>> dependency may be avoided.
> 
> Even if all buffer registration and lookup are done in ->prep(), the OP
> dependency still can't be avoided completely, for example:
> 
> 1) two local buffers for sending to two sockets
> 
> 2) group 1: IORING_OP_LOCAL_KBUF1 & [send(sock1), send(sock2)]  
> 
> 3) group 2: IORING_OP_LOCAL_KBUF2 & [send(sock1), send(sock2)]
> 
> Group 1 and group 2 need to be linked, but inside each group, the two
> sends may be submitted in parallel.

That is where groups of course work, in that you can submit 2 groups and
have each member inside each group run independently. But I do think we
need to decouple the local buffer and group concepts entirely. For the
first step, getting local buffers working with zero copy would be ideal,
and then just live with the fact that group 1 needs to be submitted
first and group 2 once the first ones are done.

Once local buffers are done, we can look at doing the sqe grouping in a
nice way. I do think it's a potentially powerful concept, but we're
going to make a lot more progress on this issue if we carefully separate
dependencies and get each of them done separately.

-- 
Jens Axboe