On 10/9/24 17:13, Joe Damato wrote:
On Wed, Oct 09, 2024 at 04:09:53PM +0100, Pavel Begunkov wrote:
On 10/8/24 23:25, Joe Damato wrote:
On Mon, Oct 07, 2024 at 03:15:56PM -0700, David Wei wrote:
From: Pavel Begunkov <asml.silence@xxxxxxxxx>
[...]
However, from time to time we need to synchronise with the napi, for
example to add more user memory or allocate fallback buffers. Add a
helper function napi_execute that allows to run a custom callback from
under napi context so that it can access and modify napi protected
parts of io_uring. It works similar to busy polling and stops napi from
running in the meantime, so it's supposed to be a slow control path.
Signed-off-by: Pavel Begunkov <asml.silence@xxxxxxxxx>
Signed-off-by: David Wei <dw@xxxxxxxxxxx>
[...]
diff --git a/net/core/dev.c b/net/core/dev.c
index 1e740faf9e78..ba2f43cf5517 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6497,6 +6497,59 @@ void napi_busy_loop(unsigned int napi_id,
}
EXPORT_SYMBOL(napi_busy_loop);
+void napi_execute(unsigned napi_id,
+ void (*cb)(void *), void *cb_arg)
+{
+ struct napi_struct *napi;
+ bool done = false;
+ unsigned long val;
+ void *have_poll_lock = NULL;
+
+ rcu_read_lock();
+
+ napi = napi_by_id(napi_id);
+ if (!napi) {
+ rcu_read_unlock();
+ return;
+ }
+
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+ preempt_disable();
+ for (;;) {
+ local_bh_disable();
+ val = READ_ONCE(napi->state);
+
+ /* If multiple threads are competing for this napi,
+ * we avoid dirtying napi->state as much as we can.
+ */
+ if (val & (NAPIF_STATE_DISABLE | NAPIF_STATE_SCHED |
+ NAPIF_STATE_IN_BUSY_POLL))
+ goto restart;
+
+ if (cmpxchg(&napi->state, val,
+ val | NAPIF_STATE_IN_BUSY_POLL |
+ NAPIF_STATE_SCHED) != val)
+ goto restart;
+
+ have_poll_lock = netpoll_poll_lock(napi);
+ cb(cb_arg);
A lot of the above code seems quite similar to __napi_busy_loop, as
you mentioned.
It might be too painful, but I can't help but wonder if there's a
way to refactor this to use common helpers or something?
I had been thinking that the napi->state check /
cmpxchg could maybe be refactored to avoid being repeated in both
places?
Yep, I can add a helper for that, but I'm not sure how to
deduplicate it further while trying not to pollute the
napi polling path.
It was just a minor nit; I wouldn't want to hold back this important
work just for that.
I'm still looking at the code myself to see if I can see a better
arrangement of the code.
But that could always come later as a cleanup for -next ?
It's still early, there will be a v6 anyway. And thanks for
taking a look.
--
Pavel Begunkov