Magnus Karlsson <magnus.karlsson@xxxxxxxxx> writes:

> Make sure that xdp_do_flush() is always executed before
> napi_complete_done(). This is important for two reasons. First, a
> redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> napi context X on CPU Y will be follwed by a xdp_do_flush() from the

Typo in 'followed' here (and in all the copy-pasted commit messages).

> same napi context and CPU. This is not guaranteed if the
> napi_complete_done() is executed before xdp_do_flush(), as it tells
> the napi logic that it is fine to schedule napi context X on another
> CPU. Details from a production system triggering this bug using the
> veth driver can be found in [1].
>
> The second reason is that the XDP_REDIRECT logic in itself relies on
> being inside a single NAPI instance through to the xdp_do_flush() call
> for RCU protection of all in-kernel data structures. Details can be
> found in [2].
>
> The drivers have only been compile-tested since I do not own any of
> the HW below. So if you are a manintainer, please make sure I did not

And another typo in 'maintainer' here.

> mess something up. This is a lousy excuse for virtio-net though, but
> it should be much simpler for the vitio-net maintainers to test this,
> than me trying to find test cases, validation suites, instantiating a
> good setup, etc. Michael and Jason can likely do this in minutes.
>
> Note that these were the drivers I found that violated the ordering by
> running a simple script and manually checking the ones that came up as
> potential offenders. But the script was not perfect in any way. There
> might still be offenders out there, since the script can generate
> false negatives.
>
> [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@xxxxxxxxxxxxxx
> [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@xxxxxxxxxx/

Otherwise LGTM! For the series:

Acked-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>
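
To illustrate the ordering the quoted commit message asks for, here is a minimal userspace sketch. It is not driver code and is not taken from the series: every `model_*` function is a hypothetical stand-in that only imitates the role of the kernel call it is named after (xdp_do_redirect(), xdp_do_flush(), napi_complete_done()), and the real signatures and behaviour are deliberately not reproduced. The model assumes a redirect leaves work pending until the flush, and reports a violation if the poll is completed while that work is still pending.

```c
/* Userspace model only. The model_* names are hypothetical stand-ins
 * and do not match the real kernel signatures. */
#include <assert.h>
#include <stdbool.h>

/* Set by a redirect, cleared by the flush. */
static bool flush_pending;

/* Stand-in for xdp_do_redirect(): leaves work queued for a later flush. */
static void model_xdp_do_redirect(void)
{
	flush_pending = true;
}

/* Stand-in for xdp_do_flush(): drains whatever the redirects queued. */
static void model_xdp_do_flush(void)
{
	flush_pending = false;
}

/* Stand-in for napi_complete_done(). Returns false if it is reached while
 * redirected work is still pending, i.e. the ordering was violated. */
static bool model_napi_complete_done(void)
{
	return !flush_pending;
}

/* Model of a poll routine. With flush_first set, the flush happens before
 * completion (the ordering the commit message requires); otherwise it
 * happens after (the ordering being fixed). Returns true if the ordering
 * was respected. */
static bool model_poll(int budget, int nframes, bool flush_first)
{
	int done = 0;
	bool ok = true;

	while (done < budget && done < nframes) {
		model_xdp_do_redirect();
		done++;
	}

	if (flush_first)
		model_xdp_do_flush();

	/* Complete only when the budget was not exhausted. */
	if (done < budget)
		ok = model_napi_complete_done();

	if (!flush_first)
		model_xdp_do_flush();

	return ok;
}
```

In this model, `model_poll(64, 3, true)` respects the ordering and `model_poll(64, 3, false)` does not, because completion is reached while a redirect is still waiting to be flushed. In a real driver the consequence is the one described in the commit message: once completion has been signalled, the napi context may be scheduled on another CPU before the flush runs.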