On Sat, Jul 06, 2024 at 11:09:50AM +0800, Ming Lei wrote:
> Hello,
>
> The first 3 patches are cleanups that prepare for adding sqe group
> support.
>
> The 4th patch supports a generic sqe group, which is like a link chain
> but allows each sqe in the group to be issued in parallel; the group
> shares the same IO_LINK & IO_DRAIN boundary, so N:M dependencies can
> be supported by combining sqe groups with io links. sqe group changes
> nothing about IOSQE_IO_LINK.
>
> The 5th patch supports one variant of sqe group: members may depend on
> the group leader, so that kernel resource lifetime can be aligned with
> the group leader or the whole group. Any kernel resource can then be
> shared within the group, which enables generic device zero copy.
>
> The 6th & 7th patches support providing an sqe group buffer via this
> group variant.
>
> The 8th patch supports ublk zero copy based on io_uring providing the
> sqe group buffer.
>
> Tests:
>
> 1) passes the liburing tests
>    - make runtests
>
> 2) two new sqe group test cases (written & passing):
>
>    https://github.com/axboe/liburing/compare/master...ming1:liburing:sqe_group_v2
>
>    - cover the related sqe flag combinations and the linking of
>      groups, with both nop and a multi-destination file copy
>
>    - cover failure handling: fail the leader IO or a member IO in both
>      a single group and linked groups, for each sqe flag combination
>      tested
>
> 3) ublksrv zero copy:
>
>    ublksrv userspace implements zero copy via sqe group & the provided
>    group kbuf:
>
>    git clone https://github.com/ublk-org/ublksrv.git -b group-provide-buf_v2
>    make test T=loop/009:nbd/061   # ublk zc tests
>
>    When running the 64KB/512KB block size test on ublk-loop
>    ('ublk add -t loop --buffered_io -f $backing'), throughput is
>    observed to double.
>
> Any comments are welcome!
>
> V4:
>    - address most comments from Pavel
>    - fix a request double free
>    - don't use io_req_commit_cqe() in io_req_complete_defer()
>    - make members' REQ_F_INFLIGHT discoverable
>    - use a common assembling check in the submission code path
>    - drop patch 3 and don't move REQ_F_CQE_SKIP out of io_free_req()
>    - don't set .accept_group_kbuf for net send zc, where members need
>      to be queued after the buffer notification is received; this can
>      be enabled in the future
>    - add a .grp_leader field via a union, sharing storage with
>      .grp_link
>    - move .grp_refs into a hole of io_kiocb, so that no extra
>      cacheline is needed for io_kiocb
>    - cleanup & documentation improvements

Hello Pavel, Jens and guys,

Gentle ping...


thanks,
Ming
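
For readers catching up on the series, below is a minimal userspace
sketch of how a group might be assembled, mirroring the
multi-destination copy case from the liburing tests above. It is a
hedged illustration, not the series' documented API: it assumes the
IOSQE_SQE_GROUP flag from the 4th patch is exported to userspace (it
is not in mainline liburing, and the bit value guessed below must
match the kernel), that the flag chains group members the way
IOSQE_IO_LINK chains links, and that the member writes only run after
the leader's read completes (which may require the leader-dependency
variant from the 5th patch). Check the actual patches and the liburing
branch linked above for the authoritative semantics.

	/* sqe_group_copy.c - hedged sketch of a 1:2 file copy using sqe
	 * groups; error and short-read handling omitted for brevity.
	 */
	#include <fcntl.h>
	#include <stdio.h>
	#include <liburing.h>

	#ifndef IOSQE_SQE_GROUP
	#define IOSQE_SQE_GROUP	(1U << 7)	/* assumption: last free sqe flag bit */
	#endif

	int main(void)
	{
		struct io_uring ring;
		struct io_uring_sqe *sqe;
		struct io_uring_cqe *cqe;
		char buf[4096];
		int src = open("src.bin", O_RDONLY);
		int dst0 = open("dst0.bin", O_WRONLY | O_CREAT, 0644);
		int dst1 = open("dst1.bin", O_WRONLY | O_CREAT, 0644);
		int i, ret;

		if (src < 0 || dst0 < 0 || dst1 < 0 ||
		    io_uring_queue_init(8, &ring, 0))
			return 1;

		/* Leader: read from the source. Assumed semantics: the first
		 * sqe with IOSQE_SQE_GROUP set starts the group, and the flag
		 * keeps chaining, like IOSQE_IO_LINK does for links.
		 */
		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_read(sqe, src, buf, sizeof(buf), 0);
		sqe->flags |= IOSQE_SQE_GROUP;

		/* Members: issued in parallel with each other, and (assumed)
		 * only after the leader completes, so buf holds valid data.
		 */
		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_write(sqe, dst0, buf, sizeof(buf), 0);
		sqe->flags |= IOSQE_SQE_GROUP;

		sqe = io_uring_get_sqe(&ring);
		io_uring_prep_write(sqe, dst1, buf, sizeof(buf), 0);
		/* flag clear: assumed to terminate the group here */

		ret = io_uring_submit(&ring);
		for (i = 0; i < ret; i++) {
			if (io_uring_wait_cqe(&ring, &cqe))
				break;
			printf("cqe res=%d\n", cqe->res);
			io_uring_cqe_seen(&ring, cqe);
		}
		io_uring_queue_exit(&ring);
		return 0;
	}

With plain links the two writes would be forced to run sequentially;
the point of the group is that both destination writes can be issued
in parallel while the group as a whole still acts as a single
IO_LINK/IO_DRAIN boundary.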