On Wed, Aug 31, 2022 at 02:31:12PM +0800, Ziyang Zhang wrote:
> On 2022/8/30 23:23, Stefan Hajnoczi wrote:
> > On Sun, Aug 28, 2022 at 12:50:03PM +0800, Ming Lei wrote:
> >> +- UBLK_IO_NEED_GET_DATA
> >> +  ublksrv pre-allocates IO buffer for each IO at default, any new project
> >> +  should use this IO buffer to communicate with ublk driver. But existed
> >> +  project may not work or be changed to in this way, so add this command
> >> +  to provide chance for userspace to use its existed buffer for handling
> >> +  IO.
> >
> > I find it hard to understand this paragraph. It seems the
> > UBLK_IO_NEED_GET_DATA command allows userspace to set up something
> > related to IO buffers. What exactly does this command do?
>
> Let me explain UBLK_IO_NEED_GET_DATA since it is designed by myself.
>
> Without UBLK_IO_NEED_GET_DATA, ublk_drv will copy data from biovecs
> into a pre-allocated buffer (addr is passed with the last COMMIT_AND_FETCH
> ioucmd) while processing a WRITE request. Please consider two cases:
>
> (1) If the backend (such as a dist-storage system using RPC) provides the
>     data buffer, it has to provide the buffer IN ADVANCE (before sending the
>     last COMMIT_AND_FETCH) without any knowledge of this incoming request.
>     This makes existing backends very hard to adapt to ublk because they may
>     want to know the data length or other attributes of the new request.
>
> (2) If the backend does not provide the data buffer IN ADVANCE, ublksrv must
>     pre-allocate the data buffer. So an additional data copy from ublksrv to
>     the backend (such as an RPC mempool) is unavoidable.
>
> With UBLK_IO_NEED_GET_DATA, the WRITE request will first be issued to ublksrv
> without data copy. Then, the backend gets the request and it can allocate a
> data buffer and embed its addr inside a new ioucmd. After the kernel driver
> gets the ioucmd, the data copy happens (from biovecs to the backend's buffer).
> Finally, the backend gets the request again with data to be written and it
> can truly handle the request.

Thanks for the explanation. Maybe it can be included in the
documentation.

This reminds me of io_uring's IOSQE_BUFFER_SELECT where userspace
provides the kernel with a buffer pool and the kernel selects buffers.
It doesn't require an extra io_uring command roundtrip
(UBLK_IO_NEED_GET_DATA).

Did you already look at IOSQE_BUFFER_SELECT and decide a similar
approach won't work for your use case?

Stefan
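
For readers following the thread, here is a minimal sketch of the two WRITE
paths described in the quoted explanation, written as a toy Python model. It
does not touch the real ublk or io_uring interfaces: `WriteRequest`,
`write_preallocated`, and `write_need_get_data` are hypothetical names, and
the copy counts follow only from the description in this thread, not from the
driver source.

```python
from dataclasses import dataclass


@dataclass
class WriteRequest:
    """Toy stand-in for a WRITE request; payload plays the role of the biovecs."""
    tag: int
    payload: bytes


def write_preallocated(req, prealloc_size=4096):
    """Case (2): ublksrv hands over a buffer before the request is known."""
    copies = 0
    # Buffer is allocated in advance, with no knowledge of the request length.
    srv_buf = bytearray(prealloc_size)
    # Copy 1: driver copies biovecs into the pre-allocated ublksrv buffer.
    srv_buf[:len(req.payload)] = req.payload
    copies += 1
    # Copy 2: ublksrv copies into the backend's own buffer (e.g. an RPC mempool).
    backend_buf = bytearray(len(req.payload))
    backend_buf[:] = srv_buf[:len(req.payload)]
    copies += 1
    return bytes(backend_buf), copies


def write_need_get_data(req):
    """NEED_GET_DATA path: the backend sees the request first, then supplies a buffer."""
    copies = 0
    # Phase 1: request is delivered without data; only metadata is visible.
    length = len(req.payload)
    # Phase 2: backend allocates a right-sized buffer and passes its address back.
    backend_buf = bytearray(length)
    # Phase 3: driver copies biovecs straight into the backend's buffer.
    backend_buf[:] = req.payload
    copies += 1
    # Phase 4: request is delivered again, now with data in place.
    return bytes(backend_buf), copies


if __name__ == "__main__":
    req = WriteRequest(tag=0, payload=b"hello ublk")
    print(write_preallocated(req)[1], "copies with a pre-allocated buffer")
    print(write_need_get_data(req)[1], "copy with NEED_GET_DATA")
```

The model trades one extra command roundtrip (phases 1 and 2) for one fewer
data copy, which is the tradeoff the IOSQE_BUFFER_SELECT question above is
probing.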