Re: [PATCH v7 3/3] io_uring: enable per-io hinting capability

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/30/24 19:13, Kanchan Joshi wrote:
With F_SET_RW_HINT fcntl, user can set a hint on the file inode, and
all the subsequent writes on the file pass that hint value down.
This can be limiting for large files (and for block device) as all the
writes can be tagged with only one lifetime hint value.
Concurrent writes (with different hint values) are hard to manage.
Per-IO hinting solves that problem.

Allow userspace to pass additional metadata in the SQE.
The type of passed metadata is expressed by a new field

	__u16 meta_type;

The new layout looks nicer, but let me elaborate on the previous
comment. I don't believe we should be restricting to only one
attribute per IO. What if someone wants to pass a lifetime hint
together with integrity information?

Instead, we might need something more extensible like an ability
to pass a list / array of typed attributes / meta information / hints
etc. An example from networking I gave last time was control messages,
i.e. cmsg. In a basic oversimplified form the API from the user
perspective could look like:

struct meta_attr {
	u16 type;
	u64 data;
};

struct meta_attr attr[] = {{HINT, hint_value}, {INTEGRITY, ptr}};
sqe->meta_attrs = attr;
sqe->meta_nr = 2;

I'm pretty sure there will be a bunch of considerations people
will name the API should account for, like sizing the struct
so it can fit integrity bits or even making it variable sized.


At this point one type META_TYPE_LIFETIME_HINT is supported.
With this type, user can pass lifetime hint values in the new field

	__u64 lifetime_val;

This accepts all lifetime hint values that are possible with
F_SET_RW_HINT fcntl.

The write handlers (io_prep_rw, io_write) send the hint value to
lower-layer using kiocb. This is good for upporting direct IO,
but not when kiocb is not available (e.g., buffered IO).

When per-io hints are not passed, the per-inode hint values are set in
the kiocb (as before). Otherwise, these take the precedence on per-inode
hints.
--
Pavel Begunkov




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux