On 11/15/24 17:12, Christoph Hellwig wrote:
On Fri, Nov 15, 2024 at 04:40:58PM +0000, Pavel Begunkov wrote:
So? If we have a strong enough requirement for something else we
can triviall add another opcode. Maybe we should just add different
opcodes for read/write with metadata so that folks don't freak out
about this?
IMHO, PI is not so special to have a special opcode for it unlike
some more generic read/write with meta / attributes, but that one
would have same questions.
Well, apparently is one the hand hand not general enough that you
don't want to give it SQE128 space, but you also don't want to give
it an opcode.
Not like there are no other options. It can be user pointers,
and now we have infra to optimise it if copy_from_user is
expensive.
One thing that doesn't feel right is double indirection, i.e.
a uptr into an array of pointers, at least if IIUC from a quick
glance. I'll follow up on that.
Maybe we just need make it uring_cmd to get out of these conflicting
requirements.
Just to make it clear: I'm not a huge fan of a separate opcode or
uring_cmd, but compared to the version in this patch it is much better.
PI as a special case. And that's more of a problem of the static
placing from previous version, e.g. it wouldn't be a problem if in the
long run it becomes sth like:
struct attr attr, *p;
if (flags & META_IN_USE_SQE128)
p = sqe + 1;
else {
copy_from_user(&attr);
p = &attr;
}
but that shouldn't be PI specific.
Why would anyone not use the SQE128 version?
!SQE128 with user pointer can easily be faster depending on the
ratio of requests that use SQE128 and don't. E.g. one PI read
following with a 100 of send/recv on average. copy_from_user
is not _that_ expensive and we're talking about zeroing an
extra never used afterwards cache line.
Though the main reason would be when you pass 2+ different
attributes and there is no space to put them in SQEs.
--
Pavel Begunkov