On 2022/9/1 09:34, Ming Lei wrote:
>
>> This makes existing backend very hard to adapt to ublk because they may
>> want to know the data length or other attributes of the new request.
>
> It is just for existing project.

Existing projects are very important. I believe that embedding
libublksrv/UAPI into existing projects (products) makes ublk more popular
and useful.

>
> Any new project can read the data from the pre-allocated buffer
> directly. That is exactly the handling flow: ublksrv gets one request from
> ublk driver, then let backend handle the request.

You are correct, Ming. ublksrv targets do not need UBLK_IO_NEED_GET_DATA.

>
>>
>> (2) If the backend does not provide the data buffer IN ADVANCE, ublksrv must
>> pre-allocates data buffer. So a additional data copy from ublksrv to
>> the backend(such as a RPC mempool) is unavoidable.
>
> Can you explain why backend can't use the pre-allocated buffer directly? Before
> backend completes the io request, the io request and buffer won't be reused, that
> is owned by this tag/slot.

For existing projects using ublksrv, why must they use ublksrv's
pre-allocated buffer? The backend has its own buffer management.

Besides, existing projects may embed libublksrv/UAPI directly.
UBLK_IO_NEED_GET_DATA is just an option for them.

Ming, the UBLK_IO_NEED_GET_DATA use cases have proved useful, and we
discussed them when I introduced it into the kernel driver. Really,
(1) users who use ublksrv directly and (2) developers who implement new
ublksrv targets do not have to care about it.

>
>>
>> With UBLK_IO_NEED_GET_DATA, the WRITE request will be firstly issued to ublksrv
>> without data copy. Then, backend gets the request and it can allocate data
>> buffer and embed its addr inside a new ioucmd. After the kernel driver gets the
>> ioucmd, the data copy happens(from biovecs to backend's buffer). Finally,
>> the backend gets the request again with data to be written and it can truly
>> handle the request.
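
To make the trade-off concrete, here is a toy Python model of the two WRITE
paths discussed above. It is only a sketch: ToyDriver, notify(), copy_to()
and the two counters are hypothetical names for illustration, not the
ublk_drv or libublksrv API. The second copy in the pre-allocated path is the
assumption from point (2), i.e. the backend keeps data in its own buffer pool.

```python
from dataclasses import dataclass


@dataclass
class ToyDriver:
    """Hypothetical stand-in for ublk_drv that only counts events."""
    round_trips: int = 0   # driver <-> server notifications
    copies: int = 0        # copies of the WRITE payload

    def notify(self, payload: bytes) -> int:
        """Tell the server about a WRITE; return only its length."""
        self.round_trips += 1
        return len(payload)

    def copy_to(self, payload: bytes, buf: bytearray) -> None:
        """Copy the payload (the biovecs in the real driver) into buf."""
        buf[:len(payload)] = payload
        self.copies += 1


def write_preallocated(drv: ToyDriver, payload: bytes) -> bytes:
    """Default path: data lands in the server's pre-allocated slot buffer."""
    slot_buf = bytearray(len(payload))   # allocated in advance, per tag
    drv.copy_to(payload, slot_buf)       # copy 1: driver -> slot buffer
    drv.notify(payload)                  # single round-trip, data already there
    backend_buf = bytearray(slot_buf)    # copy 2: slot buffer -> backend pool
    drv.copies += 1
    return bytes(backend_buf)


def write_need_get_data(drv: ToyDriver, payload: bytes) -> bytes:
    """UBLK_IO_NEED_GET_DATA path: the backend supplies the buffer."""
    nbytes = drv.notify(payload)         # pass 1: attributes only, no data
    backend_buf = bytearray(nbytes)      # backend allocates from its own pool
    drv.round_trips += 1                 # pass 2: new ioucmd carries the address
    drv.copy_to(payload, backend_buf)    # the only copy: driver -> backend
    return bytes(backend_buf)
```

In this model the pre-allocated path costs one round-trip and two copies,
while the UBLK_IO_NEED_GET_DATA path costs two round-trips and one copy.
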
>
> That is definitely inefficient, and I won't encourage any new project to
> use this command.

UBLK_IO_NEED_GET_DATA is an option. Any user who thinks it may lower
performance should not use it.

BTW, our tests show that UBLK_IO_NEED_GET_DATA adds one additional
round-trip in ublk_drv and one io_uring_enter() syscall.
UBLK_IO_NEED_GET_DATA does not lower IOPS much if:

(1) iodepth is large. io_uring batches sqes (ioucmds), so the syscall
    overhead is not significant.

(2) the backend is slow. For example, with a network (RPC) backend, we
    really do not care about this round-trip, since the backend IO
    handling is far slower than ublk_drv's data path.

In conclusion, UBLK_IO_NEED_GET_DATA is designed for existing projects,
not for ublksrv targets (though ublksrv supports this feature).
UBLK_IO_NEED_GET_DATA is COMPLETELY motivated by our real practice in
developing userspace storage products.

Regards,
Zhang
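
P.S. A back-of-the-envelope model of points (1) and (2), as a sketch only.
It assumes perfect batching (every pass for all in-flight requests shares
one io_uring_enter() call), which real workloads will not reach exactly.
The function names and parameters are hypothetical, and any numbers fed
into them are placeholders, not our measurements.

```python
def enters_per_request(iodepth: int, need_get_data: bool) -> float:
    """Amortized io_uring_enter() calls per WRITE under perfect batching."""
    if iodepth < 1:
        raise ValueError("iodepth must be at least 1")
    passes = 2 if need_get_data else 1   # NEED_GET_DATA adds one extra pass
    return passes / iodepth


def relative_overhead(extra_us: float, driver_us: float, backend_us: float) -> float:
    """Share of the baseline per-request latency that the extra round-trip adds."""
    return extra_us / (driver_us + backend_us)
```

The extra syscall cost per request is enters_per_request(n, True) minus
enters_per_request(n, False), which is 1/n and shrinks as iodepth grows.
relative_overhead() shrinks in the same way as backend_us grows.
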