On Fri, Sep 13, 2024 at 10:42:37AM -0700, Stanislav Fomichev wrote:
> On 09/12, Joe Damato wrote:

[...]

> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -6493,6 +6493,18 @@ EXPORT_SYMBOL(napi_busy_loop);
> >
> >  #endif /* CONFIG_NET_RX_BUSY_POLL */
> >
> > +static void napi_hash_add_with_id(struct napi_struct *napi, unsigned int napi_id)
> > +{
> > +	spin_lock(&napi_hash_lock);
> > +
> > +	napi->napi_id = napi_id;
> > +
> > +	hlist_add_head_rcu(&napi->napi_hash_node,
> > +			   &napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
> > +
> > +	spin_unlock(&napi_hash_lock);
> > +}
> > +
> >  static void napi_hash_add(struct napi_struct *napi)
> >  {
> >  	if (test_bit(NAPI_STATE_NO_BUSY_POLL, &napi->state))
> > @@ -6505,12 +6517,13 @@ static void napi_hash_add(struct napi_struct *napi)
> >  		if (unlikely(++napi_gen_id < MIN_NAPI_ID))
> >  			napi_gen_id = MIN_NAPI_ID;
> >  	} while (napi_by_id(napi_gen_id));
>
> [..]
>
> > -	napi->napi_id = napi_gen_id;
> > -
> > -	hlist_add_head_rcu(&napi->napi_hash_node,
> > -			   &napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
> >
> >  	spin_unlock(&napi_hash_lock);
> > +
> > +	napi_hash_add_with_id(napi, napi_gen_id);
>
> nit: it is very unlikely that napi_gen_id is gonna wrap around after the
> spin_unlock above, but maybe it's safer to have the following?
>
> static void __napi_hash_add_with_id(struct napi_struct *napi, unsigned int napi_id)
> {
> 	napi->napi_id = napi_id;
> 	hlist_add_head_rcu(&napi->napi_hash_node,
> 			   &napi_hash[napi->napi_id % HASH_SIZE(napi_hash)]);
> }
>
> static void napi_hash_add_with_id(struct napi_struct *napi, unsigned int napi_id)
> {
> 	spin_lock(&napi_hash_lock);
> 	__napi_hash_add_with_id(...);
> 	spin_unlock(&napi_hash_lock);
> }
>
> And use __napi_hash_add_with_id here before spin_unlock?

After making this change and re-testing on a couple of reboots, I haven't
been able to reproduce the page pool issue I mentioned in the other
email [1].

Not sure if I've just been... "getting lucky" or if this fixed something
that I won't fully grasp until I read the mlx5 source again.

Will test it a few more times, though.

[1]: https://lore.kernel.org/netdev/ZuMC2fYPPtWggB2w@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
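
For reference, a minimal sketch of how napi_hash_add() could end up
looking with the suggested split applied. The helper names follow the
snippet above; the spin_lock() placement and loop body are assumed from
the quoted hunk rather than taken from the final patch:

static void napi_hash_add(struct napi_struct *napi)
{
	if (test_bit(NAPI_STATE_NO_BUSY_POLL, &napi->state))
		return;

	spin_lock(&napi_hash_lock);

	/* keep bumping napi_gen_id until an unused id is found */
	do {
		if (unlikely(++napi_gen_id < MIN_NAPI_ID))
			napi_gen_id = MIN_NAPI_ID;
	} while (napi_by_id(napi_gen_id));

	/* assign the id and hash the napi while still holding
	 * napi_hash_lock, so napi_gen_id cannot be handed out twice
	 */
	__napi_hash_add_with_id(napi, napi_gen_id);

	spin_unlock(&napi_hash_lock);
}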