Re: [PATCH V8 4/7] io_uring: support SQE group

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/31/24 3:39 PM, Jens Axboe wrote:
> On 10/31/24 3:24 PM, Jens Axboe wrote:
>> On 10/25/24 6:22 AM, Ming Lei wrote:
>>> SQE group is defined as one chain of SQEs starting with the first SQE that
>>> has IOSQE_SQE_GROUP set, and ending with the first subsequent SQE that
>>> doesn't have it set, and it is similar with chain of linked SQEs.
>>>
>>> Not like linked SQEs, each sqe is issued after the previous one is
>>> completed. All SQEs in one group can be submitted in parallel. To simplify
>>> the implementation from beginning, all members are queued after the leader
>>> is completed, however, this way may be changed and leader and members may
>>> be issued concurrently in future.
>>>
>>> The 1st SQE is group leader, and the other SQEs are group member. The whole
>>> group share single IOSQE_IO_LINK and IOSQE_IO_DRAIN from group leader, and
>>> the two flags can't be set for group members. For the sake of
>>> simplicity, IORING_OP_LINK_TIMEOUT is disallowed for SQE group now.
>>>
>>> When the group is in one link chain, this group isn't submitted until the
>>> previous SQE or group is completed. And the following SQE or group can't
>>> be started if this group isn't completed. Failure from any group member will
>>> fail the group leader, then the link chain can be terminated.
>>>
>>> When IOSQE_IO_DRAIN is set for group leader, all requests in this group and
>>> previous requests submitted are drained. Given IOSQE_IO_DRAIN can be set for
>>> group leader only, we respect IO_DRAIN by always completing group leader as
>>> the last one in the group. Meantime it is natural to post leader's CQE
>>> as the last one from application viewpoint.
>>>
>>> Working together with IOSQE_IO_LINK, SQE group provides flexible way to
>>> support N:M dependency, such as:
>>>
>>> - group A is chained with group B together
>>> - group A has N SQEs
>>> - group B has M SQEs
>>>
>>> then M SQEs in group B depend on N SQEs in group A.
>>>
>>> N:M dependency can support some interesting use cases in efficient way:
>>>
>>> 1) read from multiple files, then write the read data into single file
>>>
>>> 2) read from single file, and write the read data into multiple files
>>>
>>> 3) write same data into multiple files, and read data from multiple files and
>>> compare if correct data is written
>>>
>>> Also IOSQE_SQE_GROUP takes the last bit in sqe->flags, but we still can
>>> extend sqe->flags with io_uring context flag, such as use __pad3 for
>>> non-uring_cmd OPs and part of uring_cmd_flags for uring_cmd OP.
>>
>> Did you run the liburing tests with this? I rebased it on top of the
>> flags2 patch I just sent out, and it fails defer-taskrun and crashes
>> link_drain. Don't know if others fail too. I'll try the original one
>> too, but nothing between those two should make a difference. It passes
>> just fine with just the flags2 patch, so I'm a bit suspicious this patch
>> is the issue.
> 
> False alarm, it was my messup adding the group flag. Works just fine.
> I'm attaching the version I tested, on top of that flags2 patch.
> 
> Since we're on the topic - my original bundle patch used a bundle OP to
> define an sqe grouping, which didn't need to use an sqe flag. Any
> particular reason why you went with a flag for this one?
> 
> I do think it comes out nicer with a flag for certain things, like being
> able to link groups. Maybe that's the primary reason.

Various hickups, please just see the patches here, works now:

https://git.kernel.dk/cgit/linux/log/?h=io_uring-group

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux