On Thu, Aug 22, 2019 at 01:21:09PM +0200, Florian Westphal wrote:
> commit e4db5b61c572475bbbcf63e3c8a2606bfccf2c9d upstream.
>
> Kristian Evensen says:
> In a project I am involved in, we are running ipsec (Strongswan) on
> different mt7621-based routers. Each router is configured as an
> initiator and has around ~30 tunnels to different responders (running
> on misc. devices). Before the flow cache was removed (kernel 4.9), we
> got a combined throughput of around 70Mbit/s for all tunnels on one
> router. However, we recently switched to kernel 4.14 (4.14.48), and
> the total throughput is somewhere around 57Mbit/s (best-case). I.e., a
> drop of around 20%. Reverting the flow cache removal restores, as
> expected, performance levels to that of kernel 4.9.
>
> When pcpu xdst exists, it has to be validated first before it can be
> used.
>
> A negative hit thus increases cost vs. no-cache.
>
> As number of tunnels increases, hit rate decreases so this pcpu caching
> isn't a viable strategy.
>
> Furthermore, the xdst cache also needs to run with BH off, so when
> removing this the bh disable/enable pairs can be removed too.
>
> Kristian tested a 4.14.y backport of this change and reported
> increased performance:
>
> In our tests, the throughput reduction has been reduced from around -20%
> to -5%. We also see that the overall throughput is independent of the
> number of tunnels, while before the throughput was reduced as the number
> of tunnels increased.
>
> Reported-by: Kristian Evensen <kristian.evensen@xxxxxxxxx>
> Signed-off-by: Florian Westphal <fw@xxxxxxxxx>
> Signed-off-by: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>
> ---
> Vakul Garg reports traffic going via ipsec tunnels will cause the kernel
> to spin in an infinite loop due to xfrm policy reference count
> overflowing and becoming 0.
> The refcount leak is in the pcpu cache. Instead of fixing this, just
> remove the pcpu cache -- its not present in any other stable release.
> Vakul reported that this patch fixes the problem.
>
> There are no major deviations from the upstream revert; conflicts
> were only due to context.

Now queued up, does 4.9.y also need this?

thanks,

greg k-h