Hi Jens,

On 28.01.21 at 03:19, Jens Axboe wrote:
>>> Assuming that I got that right, that means that the pdu information
>>> doesn't actually go all the way to the end of the sqe, which currently
>>> is just a bunch of padding. Was that intentional, or does this mean
>>> that io_uring_pdu could actually be 8 bytes longer?
>>
>> Also correct. The reason is actually kind of stupid, and I think we
>> should just fix that up. struct io_uring_cmd should fit within the first
>> cacheline of io_kiocb, to avoid bloating that one. But with the members
>> in there, it ends up being 8 bytes too big, if we grab those 8 bytes.
>> What I think we should do is get rid of ->done, and just have drivers
>> call io_uring_cmd_done() instead. We can provide an empty hook for that.
>> Then we can reclaim the 8 bytes, and grow the io_uring_cmd to 56 bytes.
>
> Pushed out that version:
>
> https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-fops.v2
>
> which gives you the full 56 bytes for the payload command.

I think we only have 48 bytes for the payload.

I've rebased and improved your io_uring-fops.v2 on top of your
io_uring-worker.v3.
See https://git.samba.org/?p=metze/linux/wip.git;a=shortlog;h=refs/heads/io_uring-fops

I've changed the layout like this:

struct io_uring_sqe {
        __u8    opcode;         /* type of operation for this sqe */
        __u8    flags;          /* IOSQE_ flags */
        union {
                __u16   ioprio;           /* ioprio for the request */
                __u16   cmd_personality;  /* IORING_OP_URING_CMD */
        };
        __s32   fd;             /* file descriptor to do IO on */
        union {
                __u64   off;              /* offset into file */
                __u64   addr2;
                __u64   cmd_user_data;    /* IORING_OP_URING_CMD: data to be passed back at completion time */
        };
        union {
                __u64   addr;             /* pointer to buffer or iovecs */
                __u64   splice_off_in;
                __u64   cmd_pdu_start;    /* IORING_OP_URING_CMD: this is the start for the remaining 48 bytes */
        };

And then use:

struct io_uring_cmd_pdu {
        __u64 data[6];  /* 48 bytes available for free use */
};

So we effectively have this:

struct io_uring_cmd_sqe {
        __u8    opcode;           /* type of operation for this sqe */
        __u8    flags;            /* IOSQE_ flags */
        __u16   cmd_personality;  /* IORING_OP_URING_CMD */
        __s32   fd;               /* file descriptor to do IO on */
        __u64   cmd_user_data;    /* IORING_OP_URING_CMD: data to be passed back at completion time */
        union {
                __u64   cmd_pdu_start;  /* IORING_OP_URING_CMD: this is the start for the remaining 48 bytes */
                struct io_uring_cmd_pdu cmd_pdu;
        };
};

I think it's saner to have a complete block of 48 bytes available for
the payload, and to move personality and user_data to the top if the
opcode is IORING_OP_URING_CMD, instead of having a hole that can't be
touched.

I also finished the socket glue from
struct file -> struct socket -> struct sock.

I think it compiles, but I haven't done any tests.

What do you think?

metze
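
P.S. The size arithmetic behind the "48 bytes" claim can be checked at compile time. The following is only a userspace sketch of the effective layout described above, not code from either branch: it uses <stdint.h> types as stand-ins for __u8/__u16/__s32/__u64, and cmd_sqe_set_payload() is a hypothetical helper invented here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the proposed payload area. */
struct io_uring_cmd_pdu {
	uint64_t data[6];		/* 48 bytes available for free use */
};

/* Userspace stand-in for the effective IORING_OP_URING_CMD layout. */
struct io_uring_cmd_sqe {
	uint8_t  opcode;		/* offset 0 */
	uint8_t  flags;			/* offset 1 */
	uint16_t cmd_personality;	/* offset 2 */
	int32_t  fd;			/* offset 4 */
	uint64_t cmd_user_data;		/* offset 8 */
	union {				/* offset 16, runs to the end */
		uint64_t cmd_pdu_start;
		struct io_uring_cmd_pdu cmd_pdu;
	};
};

/* 16 bytes of header + 48 bytes of payload = one 64-byte sqe, no hole. */
_Static_assert(sizeof(struct io_uring_cmd_pdu) == 48,
	       "payload must be 48 bytes");
_Static_assert(offsetof(struct io_uring_cmd_sqe, cmd_pdu) == 16,
	       "payload must start right after cmd_user_data");
_Static_assert(sizeof(struct io_uring_cmd_sqe) == 64,
	       "header plus payload must fill the whole sqe");

/*
 * Hypothetical helper (not part of any kernel or liburing API):
 * copy a driver-defined payload into the pdu, rejecting anything
 * larger than the 48 bytes that are available.
 * Returns 0 on success, -1 if the payload does not fit.
 */
static int cmd_sqe_set_payload(struct io_uring_cmd_sqe *sqe,
			       const void *payload, size_t len)
{
	if (len > sizeof(sqe->cmd_pdu))
		return -1;
	memset(&sqe->cmd_pdu, 0, sizeof(sqe->cmd_pdu));
	memcpy(&sqe->cmd_pdu, payload, len);
	return 0;
}
```

With personality and user_data at the top, the payload is a single contiguous block from offset 16 to offset 64, so a driver-specific command struct of up to 48 bytes can be copied in directly.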