Re: [PATCH 2/9] io_uring: support user sqe ext flags

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/30/24 04:43, Ming Lei wrote:
On Mon, Apr 29, 2024 at 04:24:54PM +0100, Pavel Begunkov wrote:
On 4/23/24 14:57, Ming Lei wrote:
On Mon, Apr 22, 2024 at 12:16:12PM -0600, Jens Axboe wrote:
On 4/7/24 7:03 PM, Ming Lei wrote:
sqe->flags is u8, and now we have used 7 bits, so take the last one for
extending purpose.

If bit7(IOSQE_HAS_EXT_FLAGS_BIT) is 1, it means this sqe carries ext flags
from the last byte(.ext_flags), or bit23~bit16 of sqe->uring_cmd_flags for
IORING_OP_URING_CMD.

io_slot_flags() return value is converted to `ULL` because the affected bits
are beyond 32bit now.

If we're extending flags, which is something we arguably need to do at
some point, I think we should have them be generic and not spread out.

Sorry, maybe I don't get your idea, and the ext_flag itself is always
initialized in io_init_req(), like normal sqe->flags, same with its
usage.

If uring_cmd needs specific flags and don't have them, then we should
add it just for that.

The only difference is that bit23~bit16 of sqe->uring_cmd_flags is
borrowed for uring_cmd's ext flags, because sqe byte0~47 have been taken,
and can't be reused for generic flag. If we want to use byte48~63, it has
to be overlapped with uring_cmd's payload, and it is one generic sqe
flag, which is applied on uring_cmd too.

Which is exactly the mess nobody would want to see. And I'd also

The trouble is introduced by supporting uring_cmd, and solving it by setting
ext flags for uring_cmd specially by liburing helper is still reasonable or
understandable, IMO.

argue 8 extra bits is not enough anyway, otherwise the history will
repeat itself pretty soon

It is started with 8 bits, now doubled when io_uring is basically
mature, even though history might repeat, it will take much longer time

You're mistaken, only 7 bits are taken not because there haven't been
ideas and need to use them, but because we're out of space and we've
been saving it for something that might be absolutely necessary.

POLL_FIRST IMHO should've been a generic feature, but it worked around
being a send/recv specific flag, same goes for the use of registered
buffers, not to mention ideas for which we haven't had enough flag space.

That is the only way I thought of, or any other suggestion for extending sqe
flags generically?

idea 1: just use the last bit. When we need another one it'd be time
to think about a long overdue SQE layout v2, this way we can try
to make flags u32 and clean up other problems.

It looks over-kill to invent SQE v2 just for solving the trouble in
uring_cmd, and supporting two layouts can be new trouble for io_uring.

Sounds too uring_cmd centric, it's not specifically for uring_cmd, it's
just one of reasons. As for overkill, that's why I'm not telling you
to change the layour, but suggesting to take the last bit for the
group flag and leave future problems for the future.


Also I doubt the problem can be solved in layout v2:

- 64 byte is small enough to support everything, same for v2

- uring_cmd has only 16 bytes payload, taking any byte from
the payload may cause trouble for drivers

- the only possible change could still be to suppress bytes for OP
specific flags, but it might cause trouble for some OPs, such as
network.

Look up sqe's __pad1, for example


idea 2: the group assembling flag can move into cmds. Very roughly:

io_cmd_init() {
	ublk_cmd_init();
}

ublk_cmd_init() {
	io_uring_start_grouping(ctx, cmd);
}

io_uring_start_grouping(ctx, cmd) {
	ctx->grouping = true;
	ctx->group_head = cmd->req;
}

How can you know one group is starting without any flag? Or you still
suggest the approach taken in fused command?

That would be ublk's business, e.g. ublk or cmds specific flag


submit_sqe() {
	if (ctx->grouping) {
		link_to_group(req, ctx->group_head);
		if (!(req->flags & REQ_F_LINK))
			ctx->grouping = false;
	}
}

The group needs to be linked to existed link chain, so reusing REQ_F_LINK may
not doable.

Would it break zero copy feature if you cant?

--
Pavel Begunkov




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux