On 10/30/2024 4:54 AM, Keith Busch wrote:
> On Tue, Oct 29, 2024 at 09:53:58PM +0530, Anuj Gupta wrote:
>> This patch adds the capability of sending metadata along with read/write.
>> A new meta_type field is introduced in SQE which indicates the type of
>> metadata being passed. This meta is represented by a newly introduced
>> 'struct io_uring_meta_pi' which specifies information such as flags,buffer
>> length,seed and apptag. Application sets up a SQE128 ring, prepares
>> io_uring_meta_pi within the second SQE.
>> The patch processes the user-passed information to prepare uio_meta
>> descriptor and passes it down using kiocb->private.
>>
>> Meta exchange is supported only for direct IO.
>> Also vectored read/write operations with meta are not supported
>> currently.
>
> It looks like it is reasonable to add support for fixed buffers too.
> There would be implications for subsequent patches, mostly patch 10, but
> it looks like we can do that.

Fixed buffers for data continue to be supported with this.
Do you mean fixed buffers for metadata?
We can take that as an incremental addition outside of this series, which
is already touching various subsystems (io_uring, block, nvme, scsi, fs).

> Anyway, this patch mostly looks okay to me. I don't know about the whole
> "meta_type" thing. My understanding from Pavel was wanting a way to
> chain command specific extra options.

Right. During LSFMM, he mentioned Btrfs needed to send extra stuff with
read/write.

But in general, this is about seeing metadata as a generic term for
encoding extra information into the io_uring SQE.

It is probably not uncommon for people to need to send extra stuff with
read/write and add specific processing for it. And SQE->meta_type helps
to isolate all such processing from the common case when no extra stuff
is sent.
	if (sqe->meta_type) {
		if (type1(sqe->meta_type))
			process(type1);
		if (type2(sqe->meta_type))
			process(type2);
	}

> For example, userspace metadata
> and write hints, and this doesn't look like it can be extended to do
> that.

It can be. In the past I used it to represent different types of write
hints. It is just that in the current version, write hints are being
sent without any type.
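To make the dispatch idea a bit more concrete, here is a minimal userspace sketch. It assumes meta_type is treated as a bitmask with one bit per kind of extra information. The names META_TYPE_PI, META_TYPE_HINT, struct fake_sqe and the process_*() helpers are hypothetical stand-ins for illustration, not the actual definitions from the series.

```c
#include <stdint.h>

/* Hypothetical type bits; names and values are illustrative only. */
#define META_TYPE_PI    (1u << 0)
#define META_TYPE_HINT  (1u << 1)
#define META_TYPE_KNOWN (META_TYPE_PI | META_TYPE_HINT)

/* Stand-in for the relevant SQE field; not the real struct io_uring_sqe. */
struct fake_sqe {
	uint16_t meta_type;
};

/* Placeholder handlers; each returns 1 to signal that it ran. */
static int process_pi(const struct fake_sqe *sqe)   { (void)sqe; return 1; }
static int process_hint(const struct fake_sqe *sqe) { (void)sqe; return 1; }

/*
 * Returns the number of handlers that ran, or -1 if an unknown type bit
 * is set. A zero meta_type returns immediately, so the common case pays
 * only for a single branch.
 */
static int process_meta(const struct fake_sqe *sqe)
{
	int handled = 0;

	if (!sqe->meta_type)
		return 0;
	if (sqe->meta_type & ~META_TYPE_KNOWN)
		return -1;
	if (sqe->meta_type & META_TYPE_PI)
		handled += process_pi(sqe);
	if (sqe->meta_type & META_TYPE_HINT)
		handled += process_hint(sqe);
	return handled;
}
```

With this shape, adding a new kind of extra information means adding one bit and one handler, and requests that send nothing extra are unaffected.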