On 4/19/19 3:29 AM, Stefan Bühler wrote: > Hi, > > On 17.04.19 17:02, Jens Axboe wrote: >> [...] >> >> Care to turn the pseudo code into a real patch? Would be easier to >> digest. But thanks for taking a look at this! > > Fair enough; I'm working on it. > > I'll start with the actual bugs, and will try to cleanup unneeded > barriers/docs later. That sounds great, thanks. The three patches look good to me, I've queued them up. >>> The "userspace" liburing looks even worse. For starters, >>> `io_uring_get_completion` cannot safely return a pointer to the CQ entry >>> *AND* increment head: you need to actually read the entry before >>> incrementing head. >> >> That's a good point, we should actually have this split in two to avoid >> the case where the kernel will then fill in a new entry for us. > > I see you already did so in the liburing repo. Yep > When going through liburing code again, I noticed io_uring_submit > assumes there might be pending submission entries in the SQ queue. > > But those entries are not "reserved" in the SQE queue; so > io_uring_get_sqe might overwrite SQE data that isn't read by the kernel > through SQ yet. > > Is there any motivation behind the indirection? I see two possible ideas: > - allocate fixed SQE entries for operations, and just add them to SQ > when needed > - have multiple threads fill SQE in parallel, and only add them to SQ > when done > > Are those actually worth the cost? > > liburing doesn't look like it is going to take advantage of it, and the > library I'm writing in Rust right now isn't going to either (actually > even using a fixed sq_ndx == sqe_ndx mapping, i.e. it'll only write > sq->array on startup). > > At least consider documenting the motivation :) I'll take a look at this case and see if the complexity is warranted, or at least make sure that it's exposed in a safe fashion. >>> Last but not least the kernel side has two lines it checks whether a >>> queue is full or not and uses `tail + 1 == head`: this is only true if >>> there were U32_MAX entries in the queue. You should check `(tail - >>> head) == SIZE` instead. >> >> Good catch, this is a leftover from when the rings were indeed capped by >> the mask of the entries. I've fixed this one and pushed it out, and >> added a liburing test case for it as well. > > I think you missed the second one in io_uring_poll (looking at your > for-linus branch), so I will send a patch for that too. I did indeed, my grep failed me. Thanks for catching that! -- Jens Axboe