On Fri, Jul 10, 2020 at 7:41 PM Kanchan Joshi <joshiiitr@xxxxxxxxx> wrote:
>
> On Fri, Jul 10, 2020 at 7:21 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Fri, Jul 10, 2020 at 02:49:32PM +0100, Christoph Hellwig wrote:
> > > On Fri, Jul 10, 2020 at 02:48:24PM +0100, Matthew Wilcox wrote:
> > > > If we're going to go the route of changing the CQE, how about:
> > > >
> > > >  struct io_uring_cqe {
> > > >          __u64   user_data;      /* sqe->data submission passed back */
> > > > -        __s32   res;            /* result code for this event */
> > > > -        __u32   flags;
> > > > +        union {
> > > > +                struct {
> > > > +                        __s32   res;    /* result code for this event */
> > > > +                        __u32   flags;
> > > > +                };
> > > > +                __s64   res64;
> > > > +        };
> > > >  };
> > > >
> > > > then we don't need to change the CQE size and it just depends on the SQE
> > > > whether the CQE for it uses res+flags or res64.
> > >
> > > How do you return a status code or short write when you just have
> > > a u64 that is needed for the offset?
> >
> > it's an s64 not a u64 so you can return a negative errno.  i didn't
> > think we allowed short writes for objects-which-have-a-pos.
>
> If we are doing this for zone-append (and not general cases), "__s64
> res64" should work -.
> 64 bits = 1 (sign) + 23 (bytes-copied: cqe->res) + 40
> (written-location: chunk_sector bytes limit)

This is for the scheme where a single CQE is used, with the bits
refactored into "__s64 res64" instead of res/flags.

41 bits for the zone-append completion offset, in bytes: sufficient to
cover a zone of chunk_sectors size
1+22 bits for zone-append bytes-copied: can cover 4MB bytes copied
(single I/O is capped at 4MB in NVMe)

 /*
+ * zone-append specific flags
+ */
+#define APPEND_OFFSET_BITS     (41)
+#define APPEND_RES_BITS        (23)
+
+/*
  * IO completion data structure (Completion Queue Entry)
  */
 struct io_uring_cqe {
-       __u64   user_data;      /* sqe->data submission passed back */
-       __s32   res;            /* result code for this event */
-       __u32   flags;
+       __u64   user_data;      /* sqe->data submission passed back */
+       union {
+               struct {
+                       __s32   res;    /* result code for this event */
+                       __u32   flags;
+               };
+               /* Alternate for zone-append */
+               struct {
+                       union {
+                               /*
+                                * kernel uses this to store append result
+                                * Most significant 23 bits to return number of
+                                * bytes or error, and least significant 41 bits
+                                * to return zone-relative offset in bytes
+                                */
+                               __s64 res64;
+                               /* for user-space ease, kernel does not use */
+                               struct {
+#if defined(__LITTLE_ENDIAN_BITFIELD)
+                                       __u64 append_offset : APPEND_OFFSET_BITS;
+                                       __s32 append_res : APPEND_RES_BITS;
+#elif defined(__BIG_ENDIAN_BITFIELD)
+                                       __s32 append_res : APPEND_RES_BITS;
+                                       __u64 append_offset : APPEND_OFFSET_BITS;
+#endif
+                               } __attribute__ ((__packed__));
+                       };
+               };
+       };
 };

--
Joshi
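
For anyone who wants to check the 23/41 split without relying on bitfield layout, here is a minimal user-space sketch using plain shifts and masks. It only assumes the layout described in the comment above (result in the most significant 23 bits, zone-relative byte offset in the least significant 41). The helper names append_pack(), append_unpack_res() and append_unpack_offset() are hypothetical and not part of the patch or of any io_uring header.

```c
#include <assert.h>
#include <stdint.h>

/* Bit split taken from the patch. */
#define APPEND_OFFSET_BITS 41
#define APPEND_RES_BITS    23

#define APPEND_OFFSET_MASK ((UINT64_C(1) << APPEND_OFFSET_BITS) - 1)
/* Sign bit of the 23-bit result field (hypothetical helper macro). */
#define APPEND_RES_SIGN    (UINT64_C(1) << (APPEND_RES_BITS - 1))

/*
 * Hypothetical helper: combine a result (bytes copied or negative errno)
 * and a zone-relative byte offset into one 64-bit value.
 */
static inline int64_t append_pack(int32_t res, uint64_t offset)
{
	/* res must fit in a signed 23-bit field, offset in 41 bits. */
	assert(res >= -(int32_t)APPEND_RES_SIGN && res < (int32_t)APPEND_RES_SIGN);
	assert(offset <= APPEND_OFFSET_MASK);

	/* Shift as unsigned so that negative results are well defined. */
	uint64_t hi = (uint64_t)(int64_t)res << APPEND_OFFSET_BITS;

	/* Conversion back to signed wraps on GCC and Clang (two's complement). */
	return (int64_t)(hi | offset);
}

/* Hypothetical helper: recover the signed 23-bit result. */
static inline int32_t append_unpack_res(int64_t res64)
{
	uint64_t hi = (uint64_t)res64 >> APPEND_OFFSET_BITS;

	/* Portable sign extension from 23 bits: flip the sign bit, subtract it. */
	return (int32_t)(hi ^ APPEND_RES_SIGN) - (int32_t)APPEND_RES_SIGN;
}

/* Hypothetical helper: recover the 41-bit zone-relative offset. */
static inline uint64_t append_unpack_offset(int64_t res64)
{
	return (uint64_t)res64 & APPEND_OFFSET_MASK;
}
```

With this layout a negative errno in the result field makes the whole res64 negative, so a caller could still test "res64 < 0" first. One thing the sketch makes visible: a signed 23-bit field tops out at 2^22 - 1, which is one byte short of 4MB.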