Re: Sending CQE to a different ring

> It's not the branches I'm worried about, it's the growing of the request
> to accommodate it, and the need to bring in another fd for this.
Maybe it's worth pre-registering the fds of the rings to which we can send CQEs, similarly to how file fds are pre-registered? That would allow us to use a u8 or u16 instead of a u64 to identify the recipient ring.
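
For reference, here is a minimal sketch of how file fds are pre-registered with liburing today; a table of ring fds for CQE targets could presumably follow the same shape. Only io_uring_register_files() below is a real API, the ring-side equivalent is just the idea above:

#include <liburing.h>
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    struct io_uring ring;
    int fds[2];
    int ret;

    if (io_uring_queue_init(8, &ring, 0) < 0)
        return 1;

    fds[0] = open("/etc/hostname", O_RDONLY);
    fds[1] = open("/etc/hosts", O_RDONLY);

    /*
     * Pre-register the fds; subsequent requests can refer to them by a
     * small index (0 or 1) with IOSQE_FIXED_FILE instead of passing the
     * raw fd every time. A pre-registered ring-fd table could let an SQE
     * name its recipient ring by a similarly small index.
     */
    ret = io_uring_register_files(&ring, fds, 2);
    if (ret < 0)
        fprintf(stderr, "register_files failed: %d\n", ret);

    io_uring_queue_exit(&ring);
    return 0;
}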

> But I guess I'm still a bit confused on what this will buy us. The
> request is still being executed on the first ring (and hence the thread
> associated with it), with the suggested approach here the only thing
> you'd gain is the completion going somewhere else. Is this purely about
> the post-processing that happens when a completion is posted to a given
> ring?

As I wrote earlier, I am not familiar with the internals of the io_uring implementation, so I am speaking purely from the user's point of view. I will trust your judgment regarding implementation complexity.

I guess that, from the user's PoV, it does not matter on which ring the SQE is executed. It can have certain performance implications, but otherwise it's simply an implementation detail for the user.

> How did the original thread end up with the work to begin with? Was the
> workload evenly distributed at that point, but later conditions (before
> it gets issued) mean that the situation has now changed and we'd prefer
> to execute it somewhere else?

Let's talk about a concrete, simplified example. Imagine a server which accepts commands over the network to compute the hash of a file at a given path. The server executes the following algorithm (a rough code sketch of steps 3-5 follows the list):

1) Accept a connection
2) Read a command
3) Open the file and create the hasher state
4) Read a chunk of data from the file
5) If the read data is not empty, update the hasher state and go to step 4; otherwise finalize the hasher
6) Return the resulting hash and go to step 2
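
To make steps 3-5 concrete, here is a minimal single-file sketch using liburing; the trivial FNV-1a hash and the fixed chunk size are placeholders, the real server would of course multiplex many connections and files on the same ring:

#include <liburing.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>

#define CHUNK 4096

int main(int argc, char **argv)
{
    struct io_uring ring;
    struct io_uring_sqe *sqe;
    struct io_uring_cqe *cqe;
    char buf[CHUNK];
    uint64_t hash = 0xcbf29ce484222325ULL; /* FNV-1a offset basis */
    uint64_t off = 0;
    int fd;

    if (argc < 2 || io_uring_queue_init(8, &ring, 0) < 0)
        return 1;

    /* Step 3: open the file and create the hasher state. */
    fd = open(argv[1], O_RDONLY);
    if (fd < 0)
        return 1;

    for (;;) {
        /* Step 4: submit a read for the next chunk and wait for its CQE. */
        sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, CHUNK, off);
        io_uring_submit(&ring);
        io_uring_wait_cqe(&ring, &cqe);

        int n = cqe->res;
        io_uring_cqe_seen(&ring, cqe);
        if (n <= 0)
            break; /* step 5: empty read (or error) -> finalize */

        /* Step 5: update the hasher state and loop back to step 4. */
        for (int i = 0; i < n; i++) {
            hash ^= (unsigned char)buf[i];
            hash *= 0x100000001b3ULL; /* FNV-1a prime */
        }
        off += n;
    }

    printf("%016llx\n", (unsigned long long)hash);
    io_uring_queue_exit(&ring);
    return 0;
}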

We have two places where we can balance the load. First, after we accept a connection, we have to decide which ring will process it. Second, when creating the SQE for step 4, if the current thread is overloaded, we can transfer the task to a different thread.

The problem is that we cannot predict how the kernel will return the read chunks. Even if we distribute SQEs evenly across rings, it's possible that the kernel will return CQEs to a single ring in a burst, thus overloading it, while the other threads starve for events.

On second thought, it looks like your solution with IORING_OP_WAKEUP_RING will have the following advantage: it will allow us to migrate a task before execution of step 5 has started, while with my proposal we would only be able to migrate tasks at SQE creation time (i.e. at step 4).

> One idea... You issue the request as you normally would for ring1, and
> you mark that request A with IOSQE_CQE_SKIP_SUCCESS. Then you link an
> IORING_OP_WAKEUP_RING to request A, with the fd for it set to ring2, and
> also mark that with IOSQE_CQE_SKIP_SUCCESS.

Looks interesting! I had forgotten about linking and IOSQE_CQE_SKIP_SUCCESS.
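
If I follow, the chain would look roughly like the sketch below. IOSQE_IO_LINK and IOSQE_CQE_SKIP_SUCCESS are existing flags; io_uring_prep_wakeup_ring() is a made-up helper standing in for the proposed IORING_OP_WAKEUP_RING opcode:

#include <liburing.h>

void queue_read_with_wakeup(struct io_uring *ring1, int ring2_fd,
                            int file_fd, void *buf, unsigned len,
                            __u64 off, void *user_data)
{
    struct io_uring_sqe *sqe;

    /*
     * Request A: the actual read, issued on ring1. On success its CQE is
     * suppressed; on failure it still posts a CQE to ring1.
     */
    sqe = io_uring_get_sqe(ring1);
    io_uring_prep_read(sqe, file_fd, buf, len, off);
    sqe->flags |= IOSQE_IO_LINK | IOSQE_CQE_SKIP_SUCCESS;
    io_uring_sqe_set_data(sqe, user_data);

    /*
     * Linked wakeup: runs only after A completes successfully and posts a
     * CQE to ring2, so ring2's thread picks up the post-processing. Its
     * own CQE on ring1 is suppressed as well.
     */
    sqe = io_uring_get_sqe(ring1);
    io_uring_prep_wakeup_ring(sqe, ring2_fd, user_data); /* hypothetical */
    sqe->flags |= IOSQE_CQE_SKIP_SUCCESS;

    io_uring_submit(ring1);
}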


