On Thu, Jul 25, 2024 at 06:38:58AM -0700, Jakub Kicinski wrote:
> On Wed, 24 Jul 2024 17:49:12 +0200 Maciej Fijalkowski wrote:
> > > So if we are already in the af_xdp handler, and update patch sets pool
> > > to NULL - the af_xdp handler will be fine with the pool becoming NULL?
> > > I guess it may be fine, it's just quite odd to call the function called
> > > _ONCE() multiple times..
> >
> > Update path before NULLing pool will go through rcu grace period, stop
> > napis, disable irqs, etc. Running napi won't be exposed to nulled pool in
> > such case.
>
> Could you make it clearer what condition the patch is fixing, then?
> What can go wrong without this patch?

Sorry for the confusion, but without this patch the scenario you brought
up initially *could* happen, under some wild circumstances. When I was
responding yesterday my head was around the code with this particular
patch in place, which is why I said such a pool state transition was not
possible.

The updater does this (prior to this patch):

	(...)
	ring->xsk_pool = ice_get_xp_from_qid(vsi, qid); // set to NULL
	(...)
	ice_qvec_toggle_napi(vsi, q_vector, true);
	ice_qvec_ena_irq(vsi, q_vector);

In theory the compiler is allowed to transform this code so that the
xsk_pool assignment happens *after* napi has been kicked. So in
ice_napi_poll():

	if (tx_ring->xsk_pool)
		wd = ice_xmit_zc(tx_ring); // call ZC routine
	else if (ice_ring_is_xdp(tx_ring))
		wd = true;
	else
		wd = ice_clean_tx_irq(tx_ring, budget);

you would take the ZC Tx path because the xsk_pool pointer was still
valid, and then crash in the middle of the job once the pointer is
finally NULLed. To prevent that:

updater:

	(...)
	WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
	(...)
	ice_qvec_toggle_napi(vsi, q_vector, true);
	ice_qvec_ena_irq(vsi, q_vector);
	/* make sure NAPI sees updated ice_{t,r}x_ring::xsk_pool */
	synchronize_net();

reader:

	if (READ_ONCE(tx_ring->xsk_pool))
		wd = ice_xmit_zc(tx_ring);
	else if (ice_ring_is_xdp(tx_ring))
		wd = true;
	else
		wd = ice_clean_tx_irq(tx_ring, budget);

Does that make sense now?
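
If it helps, here is a minimal userspace sketch of the same pattern (my
own reduction, not the driver code: struct pool, reader() and the timing
are made up, and the _ONCE() macros are open-coded with volatile casts
roughly the way the kernel defines them):

	#include <pthread.h>
	#include <stdio.h>
	#include <unistd.h>

	/* Open-coded equivalents of the kernel's _ONCE() helpers: a
	 * volatile access forces the compiler to emit exactly one
	 * load/store right here, so it can neither cache the pointer
	 * in a register nor move the access someplace else.
	 */
	#define WRITE_ONCE(x, val)	(*(volatile typeof(x) *)&(x) = (val))
	#define READ_ONCE(x)		(*(volatile typeof(x) *)&(x))

	struct pool { int data; };	/* stand-in for the XSK pool */

	static struct pool *xsk_pool;	/* stand-in for ring->xsk_pool */

	/* Reader: mirrors the ice_napi_poll() check. The pointer is
	 * loaded once into a local, so it cannot flip to NULL between
	 * the test and the use.
	 */
	static void *reader(void *arg)
	{
		for (int i = 0; i < 1000000; i++) {
			struct pool *pool = READ_ONCE(xsk_pool);

			if (pool)		  /* "if (READ_ONCE(...))" */
				(void)pool->data; /* ZC path, local copy */
			/* else: regular cleaning path */
		}
		return NULL;
	}

	int main(void)
	{
		static struct pool p = { .data = 42 };
		pthread_t t;

		WRITE_ONCE(xsk_pool, &p);
		pthread_create(&t, NULL, reader, NULL);

		usleep(1000);
		WRITE_ONCE(xsk_pool, NULL);	/* updater NULLs the pool */
		/* in the kernel the updater now calls synchronize_net(),
		 * so in-flight NAPI pollers drain before the pool memory
		 * actually goes away
		 */

		pthread_join(t, NULL);
		puts("done");
		return 0;
	}

Compile with gcc -pthread. Without the _ONCE() annotations the compiler
is free to re-read xsk_pool between the check and the dereference, which
is exactly the window the patch closes.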