On Fri, Mar 21, 2025 at 08:24:43PM +0000, Pavel Begunkov wrote:
> On 3/21/25 18:48, Caleb Sander Mateos wrote:
> > To use ublk zero copy, an application submits a sequence of io_uring
> > operations:
> > (1) Register a ublk request's buffer into the fixed buffer table
> > (2) Use the fixed buffer in some I/O operation
> > (3) Unregister the buffer from the fixed buffer table
> >
> > The ordering of these operations is critical; if the fixed buffer lookup
> > occurs before the register or after the unregister operation, the I/O
> > will fail with EFAULT or even corrupt a different ublk request's buffer.
> > It is possible to guarantee the correct order by linking the operations,
> > but that adds overhead and doesn't allow multiple I/O operations to
> > execute in parallel using the same ublk request's buffer. Ideally, the
> > application could just submit the register, I/O, and unregister SQEs in
> > the desired order without links and io_uring would ensure the ordering.
> > This mostly works, leveraging the fact that each io_uring SQE is prepped
> > and issued non-blocking in order (barring link, drain, and force-async
> > flags). But it requires the fixed buffer lookup to occur during the
> > initial non-blocking issue.
>
> In other words, leveraging internal details that is not a part
> of the uapi, should never be relied upon by the user and is fragile.
> Any drain request or IOSQE_ASYNC and it'll break, or for any reason
> why it might be desirable to change the behaviour in the future.
>
> Sorry, but no, we absolutely can't have that, it'll be an absolute
> nightmare to maintain as basically every request scheduling decision
> now becomes a part of the uapi.
>
> There is an api to order requests, if you want to order them you
> either have to use that or do it in user space. In your particular
> case you can try to opportunistically issue them without ordering
> by making sure the reg buffer slot is not reused in the meantime
> and handling request failures.

I agree, the ordering should be provided at the UAPI/syscall level.

SQE group does address this ordering issue, and it now works together
with the fixed buffer registering OP.

If no one objects, I will post the patch for review.

Thanks,
Ming
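
To make the failure modes in the quoted description concrete, here is a
toy userspace model of the fixed buffer table. It is only a sketch and not
kernel or liburing code: the names (buf_table, reg_buf, lookup_buf,
unreg_buf), the table size, and the -EBUSY/-EINVAL returns are all
assumptions made for illustration. Only the -EFAULT on a lookup of an
empty slot and the "wrong request's buffer" case come from the thread.

```c
#include <errno.h>

#define NR_SLOTS 4   /* hypothetical table size, for illustration only */
#define SLOT_EMPTY (-1)

/* Toy stand-in for the fixed buffer table: slot -> owning request tag. */
struct buf_table {
	int owner[NR_SLOTS];
};

static void table_init(struct buf_table *t)
{
	for (int i = 0; i < NR_SLOTS; i++)
		t->owner[i] = SLOT_EMPTY;
}

/* Step (1): register request `tag`'s buffer into `slot`. */
static int reg_buf(struct buf_table *t, int slot, int tag)
{
	if (slot < 0 || slot >= NR_SLOTS || tag < 0)
		return -EINVAL;
	if (t->owner[slot] != SLOT_EMPTY)
		return -EBUSY;	/* assumed: slot still in use */
	t->owner[slot] = tag;
	return 0;
}

/*
 * Step (2): look up the buffer in `slot`. Returns the tag of the request
 * whose buffer was found, or -EFAULT if the slot is empty (lookup ran
 * before the register or after the unregister).
 */
static int lookup_buf(const struct buf_table *t, int slot)
{
	if (slot < 0 || slot >= NR_SLOTS)
		return -EINVAL;
	if (t->owner[slot] == SLOT_EMPTY)
		return -EFAULT;
	return t->owner[slot];
}

/* Step (3): unregister whatever buffer is in `slot`. */
static int unreg_buf(struct buf_table *t, int slot)
{
	if (slot < 0 || slot >= NR_SLOTS || t->owner[slot] == SLOT_EMPTY)
		return -EINVAL;
	t->owner[slot] = SLOT_EMPTY;
	return 0;
}
```

In this model a lookup that runs too early or too late returns -EFAULT,
which the caller can detect and retry. The dangerous case is a late lookup
after the slot has been reused: register tag 7, unregister, register tag 9
in the same slot, and a delayed lookup on behalf of tag 7 gets 9 back with
no error. That is why issuing without ordering is only safe if the slot is
not reused while any I/O for the old request may still be pending.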