On 10/2/2024 7:56 PM, Pavel Begunkov wrote:
> On 9/30/24 19:13, Kanchan Joshi wrote:
>> With F_SET_RW_HINT fcntl, user can set a hint on the file inode, and
>> all the subsequent writes on the file pass that hint value down.
>> This can be limiting for large files (and for block device) as all the
>> writes can be tagged with only one lifetime hint value.
>> Concurrent writes (with different hint values) are hard to manage.
>> Per-IO hinting solves that problem.
>>
>> Allow userspace to pass additional metadata in the SQE.
>> The type of passed metadata is expressed by a new field
>>
>>	__u16 meta_type;
>
> The new layout looks nicer, but let me elaborate on the previous
> comment. I don't believe we should be restricting to only one
> attribute per IO. What if someone wants to pass a lifetime hint
> together with integrity information?

That is the reason I made meta_type accept multiple bit values.
META_TYPE_LIFETIME_HINT and a new META_TYPE_INTEGRITY can coexist.
Since meta_type is a __u16, up to 16 meta types can coexist.

> Instead, we might need something more extensible like an ability
> to pass a list / array of typed attributes / meta information / hints
> etc. An example from networking I gave last time was control messages,
> i.e. cmsg. In a basic oversimplified form the API from the user
> perspective could look like:
>
> struct meta_attr {
>	u16 type;
>	u64 data;
> };
>
> struct meta_attr attr[] = {{HINT, hint_value}, {INTEGRITY, ptr}};
> sqe->meta_attrs = attr;
> sqe->meta_nr = 2;

I did not want to add a pointer (and pay the copy_from_user cost) for
integrity. Currently integrity uses space in the second SQE, which seems
fine [*]. Down the line, if meta types increase and we are on the verge
of running out of SQE space, we can resort to adding an indirect
reference.

[*] https://lore.kernel.org/linux-nvme/20241016112912.63542-8-anuj20.g@xxxxxxxxxxx/
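
To make the two shapes discussed above concrete, here is a rough userspace
sketch, not the actual uapi. Everything in it is an assumption made for
illustration: the bit values of the META_TYPE_* flags, the mock_sqe struct
and its field names, the ATTR_* type ids and the helper functions. Only the
names meta_type, META_TYPE_LIFETIME_HINT, META_TYPE_INTEGRITY and
struct meta_attr come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values; the real ones are whatever the series defines. */
#define META_TYPE_LIFETIME_HINT ((uint16_t)(1u << 0))
#define META_TYPE_INTEGRITY     ((uint16_t)(1u << 1))

/* Mock of the relevant SQE fields only; NOT the real struct io_uring_sqe. */
struct mock_sqe {
	uint16_t meta_type;      /* bitmask of META_TYPE_* flags */
	uint64_t lifetime_val;   /* inline lifetime hint value */
	uint64_t integrity_info; /* stand-in for the inline integrity fields */
};

/* Shape 1: bitmask. Each attribute sets its bit and fills inline space. */
static inline void mock_sqe_set_hint(struct mock_sqe *sqe, uint64_t hint)
{
	sqe->meta_type |= META_TYPE_LIFETIME_HINT;
	sqe->lifetime_val = hint;
}

static inline void mock_sqe_set_integrity(struct mock_sqe *sqe, uint64_t info)
{
	sqe->meta_type |= META_TYPE_INTEGRITY;
	sqe->integrity_info = info;
}

/* Returns nonzero if all bits in 'types' are set in the SQE. */
static inline int mock_sqe_has(const struct mock_sqe *sqe, uint16_t types)
{
	return (sqe->meta_type & types) == types;
}

/* Shape 2: array of typed attributes, as in the quoted proposal. */
enum { ATTR_HINT = 1, ATTR_INTEGRITY = 2 }; /* hypothetical type ids */

struct meta_attr {
	uint16_t type;
	uint64_t data;
};

/*
 * Linear lookup of one attribute type. In the kernel the array would
 * first have to be copied in from userspace, which is the extra cost
 * mentioned above.
 */
static inline const struct meta_attr *
meta_attr_find(const struct meta_attr *attrs, size_t nr, uint16_t type)
{
	assert(attrs != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		if (attrs[i].type == type)
			return &attrs[i];
	}
	return NULL;
}
```

With the bitmask, passing a hint together with integrity is two calls on the
same SQE and no extra user pointer to dereference. The array form is not
limited to 16 types but needs the indirect reference.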