Re: [RFC PATCH v4 13/16] io_uring: add io_recvzc request

On 3/16/24 16:59, Jens Axboe wrote:
On 3/15/24 5:52 PM, Pavel Begunkov wrote:
On 3/15/24 18:38, Jens Axboe wrote:
On 3/15/24 11:34 AM, Pavel Begunkov wrote:
On 3/14/24 16:14, Jens Axboe wrote:
[...]
@@ -1053,6 +1058,85 @@ struct io_zc_rx_ifq *io_zc_verify_sock(struct io_kiocb *req,
     return ifq;
 }
 
+int io_recvzc_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+    struct io_recvzc *zc = io_kiocb_to_cmd(req, struct io_recvzc);
+
+    /* non-iopoll defer_taskrun only */
+    if (!req->ctx->task_complete)
+        return -EINVAL;

What's the reasoning behind this?

CQ locking, see the comment a couple lines below
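
[Aside, for readers following the CQ-locking point: with DEFER_TASKRUN only the
submitter task posts CQEs, so the completion path can skip the completion_lock
spinlock. The helpers below are a simplified sketch of that conditional locking,
not code from this patchset; the names example_cq_lock()/example_cq_unlock() are
made up.]

/*
 * Simplified sketch: when ctx->task_complete is set (DEFER_TASKRUN,
 * non-IOPOLL), CQEs are only ever posted by the submitter task, so no
 * spinlock is needed around CQ writes.
 */
static inline void example_cq_lock(struct io_ring_ctx *ctx)
{
        if (!ctx->task_complete)
                spin_lock(&ctx->completion_lock);
}

static inline void example_cq_unlock(struct io_ring_ctx *ctx)
{
        if (!ctx->task_complete)
                spin_unlock(&ctx->completion_lock);
}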

My question here was more towards "is this something we want to do".
Maybe this is just a temporary work-around and it's nothing to discuss,
but I'm not sure we want to have opcodes only work on certain ring
setups.
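
[For reference, "certain ring setups" here means rings created with
IORING_SETUP_DEFER_TASKRUN, which in turn requires IORING_SETUP_SINGLE_ISSUER.
A minimal liburing sketch of such a setup, assuming liburing is available:]

#include <liburing.h>

/*
 * Create a ring that satisfies the ctx->task_complete check above:
 * DEFER_TASKRUN (without IOPOLL) is what sets task_complete, and it
 * must be combined with SINGLE_ISSUER.
 */
static int setup_defer_taskrun_ring(struct io_uring *ring)
{
        return io_uring_queue_init(64, ring,
                                   IORING_SETUP_SINGLE_ISSUER |
                                   IORING_SETUP_DEFER_TASKRUN);
}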

I don't think it's that unreasonable restricting it. It's hard to
care about !DEFER_TASKRUN for net workloads, it makes CQE posting a bit

I think there's a distinction between "not reasonable to support because
it's complicated/impossible to do so", and "we prefer not to support
it". I agree, as a developer it's hard to care about !DEFER_TASKRUN for
networking workloads, but as a user, they will just set up a default
queue until they wise up. And maybe this can be a good thing in that

They'd still need to find a supported NIC and do all the other
setup; compared to that it doesn't add much trouble. And my

Hopefully down the line, it'll work on more NICs,

I wouldn't expect all the necessary features to show up in consumer
cards

and configuration will be less of a nightmare than it is now.

I'm already assuming steering will be taken care of by the kernel,
but you have to choose your nic, allocate an ifq, mmap a ring,
and then you're getting scattered chunks instead of

recv((void *)one_large_buffer);

My point is that it requires more involvement from the user by design.
My usual argument is that io_uring is a low-level api, it's expected
that people interacting with it directly are experienced enough,
expect to spend some time to make it right and likely library
devs.
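
[To illustrate the contrast being drawn: with plain recv() the kernel copies
into one contiguous user buffer, while with zero-copy receive the application
walks completion entries describing chunks scattered across a pre-mapped buffer
area and later hands them back via a refill ring. The struct zc_chunk layout and
the function names below are made up for illustration and are not this
patchset's uapi.]

#include <stdint.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical chunk descriptor: offset + length into the mapped area. */
struct zc_chunk {
        uint64_t off;
        uint32_t len;
};

/* Conventional receive: one large contiguous buffer, one copy. */
static ssize_t read_plain(int sockfd, void *one_large_buffer, size_t len)
{
        return recv(sockfd, one_large_buffer, len, 0);
}

/* Zero-copy receive, roughly: consume scattered chunks in place. */
static void read_chunks(const char *area_base, const struct zc_chunk *chunks,
                        unsigned int nr,
                        void (*consume)(const char *data, size_t len))
{
        for (unsigned int i = 0; i < nr; i++)
                consume(area_base + chunks[i].off, chunks[i].len);
        /* chunks must then be returned to the kernel via the refill ring */
}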

Have you seen some of the code that has gone into libraries for
io_uring support? I have, and I don't think that statement is true at
all for that side.

Well, some implementations are crappy, some are ok, some are
learning and improving what they have.


It should work out of the box even with a naive approach, while the best
approach may require some knowledge. At least I think that's the sanest
stance on that.

they'd be nudged toward DEFER_TASKRUN, but I can also see some head
scratching when something just returns (the worst of all error codes)
-EINVAL when they attempt to use it.

Yeah, we should try to find a better error code, and the check
should migrate to ifq registration.

Wasn't really a jab at the code in question, just more that -EINVAL is
the ubiquitous error code for all kinds of things and it's hard to
diagnose in general for a user. You just have to start guessing...

cleaner, and who knows where the single task part would become handy.

But you can still take advantage of single task, since you know if
that's going to be true or not. It just can't be unconditional.

Thinking about ifq termination, which had better cancel and wait
for all corresponding zc requests, it should be easier without
parallel threads. E.g. what if another thread is in the enter syscall
using ifq, or running task_work and not cancellable. Then apart
from (non-atomic) refcounting, we'd need to somehow wait for it,
doing wake ups on the zc side, and so on.

I don't know, not seeing a lot of strong arguments for making it
DEFER_TASKRUN only. My worry is that once we start doing that, then
more will follow. And honestly I think that would be a shame.

For ifq termination, surely these things are referenced, and termination
would need to wait for the last reference to drop? And if that isn't an
expected condition (it should not be), then a percpu ref would suffice.
Nobody cares if the teardown side is more expensive, as long as the fast
path is efficient.
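
[A minimal sketch of the percpu ref pattern being suggested here, using the
kernel's percpu_ref API; the struct example_ifq wrapper and its field names are
assumptions for illustration, not the patchset's actual ifq layout.]

#include <linux/kernel.h>
#include <linux/percpu-refcount.h>
#include <linux/completion.h>
#include <linux/gfp.h>

/* Hypothetical ifq wrapper, for illustration only. */
struct example_ifq {
        struct percpu_ref refs;
        struct completion refs_done;
};

static void example_ifq_ref_release(struct percpu_ref *ref)
{
        struct example_ifq *ifq = container_of(ref, struct example_ifq, refs);

        complete(&ifq->refs_done);
}

static int example_ifq_init(struct example_ifq *ifq)
{
        init_completion(&ifq->refs_done);
        return percpu_ref_init(&ifq->refs, example_ifq_ref_release,
                               0, GFP_KERNEL);
}

/* Fast path: per-cpu get/put around each zc request, no atomics in the common case. */
static bool example_ifq_get(struct example_ifq *ifq)
{
        return percpu_ref_tryget_live(&ifq->refs);
}

static void example_ifq_put(struct example_ifq *ifq)
{
        percpu_ref_put(&ifq->refs);
}

/* Teardown: kill the ref, wait for the last user to drop it, then free. */
static void example_ifq_teardown(struct example_ifq *ifq)
{
        percpu_ref_kill(&ifq->refs);
        wait_for_completion(&ifq->refs_done);
        percpu_ref_exit(&ifq->refs);
}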

You can solve any of that, it's true; the question is how much crap
you'd need to add in hot paths and diffstat-wise. Just take a look
at what a nice function io_recvmsg() is together with its helpers
like io_recvmsg_multishot().

That is true, and I guess my real question is "what would it look like
if we supported !DEFER_TASKRUN". Which I think is a valid question.

The biggest concern is optimisations and quirks that we can't
predict at the moment. DEFER_TASKRUN/SINGLE_ISSUER provide a simpler
model, I'd rather keep recvzc simple than having tens of conditional
optimisations with different execution flavours and contexts.
Especially since it can be implemented later; it wouldn't work the
other way around.

Yes me too, and I'd hate to have two variants just because of that. But
comparing to e.g. io_recv() and helpers, it's really not that bad. Hence
my question on how much would it take, and how nasty would it be, to
support !DEFER_TASKRUN.

It might look bearable... at first, but where does it stop?
There will definitely be fixes and optimisations, even though in my
mind it's something that is not even needed. I guess I'm too
traumatised by the amount of uapi binding features I wish I
could axe out and never see again.

--
Pavel Begunkov



