Hi,

On Fri, Nov 13, 2020 at 9:50 AM Kristian Evensen <kristian.evensen@xxxxxxxxx> wrote:
> Yes, you are right in that NAT can have a large effect on performance,
> especially when you start being CPU-limited. However, when using perf
> to profile the kernel during my tests, no function related to
> netfilter/conntrack appeared very high on the list. I would also
> expect the modem to at least reach the performance of the dongle, with
> offloading being switched off. However, there could be some detail I
> missed.

I continued working on this issue today and I believe I have found at
least one reason for my performance problems. My initial attempts at
profiling resulted in quite noisy perf files, and this caused me to look
in the wrong places. Today I figured out how to get a cleaner file, and I
noticed that a lot of CPU time was spent in pskb_expand_head() and its
support functions.

My MT7621 devices are used as routers, so before the packets are sent
out on the LAN, additional headers have to be added. The current code in
qmimux_rx_fixup() allocates an SKB for each aggregated packet and copies
the data from the URB. The newly allocated SKB has too little headroom,
so when we get to ip_forward(), the headroom check in skb_cow() fails and
the SKB is reallocated.

After increasing the allocation size to include the required headroom,
and reserving that many bytes of headroom, I see a huge performance
increase. I go from around 230 Mbit/s to 280 Mbit/s, with significantly
less CPU usage. 280 Mbit/s is the same speed as I get from my phone
connected to the same network, so it seems to be the maximum the network
can deliver right now.

I do not know what would be an acceptable way (if any) to get this fix
upstreamed. I currently add an additional "safe" amount of data, but I
am pretty sure ETH_HLEN + 2 is not an acceptable solution :)

Kristian
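
To illustrate the headroom problem described above, here is a minimal userspace sketch in C. It is not the qmi_wwan code and does not use the kernel SKB API. The toy_skb struct, the toy_* helpers and forward_reallocs() are hypothetical names made up for this example, and TOY_ETH_HLEN (14) stands in for the kernel's ETH_HLEN. It models only the bookkeeping: a buffer allocated with no spare bytes in front of the payload must be reallocated and copied when a header is prepended, while a buffer with reserved headroom need not be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ETH_HLEN 14 /* stand-in for the kernel's ETH_HLEN */

/* Toy model of an SKB: only the head/data bookkeeping (hypothetical). */
struct toy_skb {
	unsigned char *head;   /* start of the allocation */
	unsigned char *data;   /* start of the packet data */
	size_t len;            /* length of the packet data */
	unsigned int reallocs; /* how often the buffer had to be expanded */
};

/* Allocate payload plus `headroom` spare bytes in front, then copy it in. */
static int toy_alloc(struct toy_skb *skb, const void *payload, size_t len,
		     size_t headroom)
{
	skb->head = malloc(headroom + len);
	if (!skb->head)
		return -1;
	skb->data = skb->head + headroom; /* analogous to reserving headroom */
	skb->len = len;
	skb->reallocs = 0;
	memcpy(skb->data, payload, len);
	return 0;
}

/* Simplified model of a headroom check: expand and copy if too small. */
static int toy_cow(struct toy_skb *skb, size_t needed)
{
	size_t have = (size_t)(skb->data - skb->head);
	unsigned char *new_head;

	if (have >= needed)
		return 0; /* enough headroom, nothing to do */

	new_head = malloc(needed + skb->len);
	if (!new_head)
		return -1;
	memcpy(new_head + needed, skb->data, skb->len);
	free(skb->head);
	skb->head = new_head;
	skb->data = new_head + needed;
	skb->reallocs++;
	return 0;
}

/* Prepend a header, expanding the buffer first if required. */
static int toy_push(struct toy_skb *skb, const void *hdr, size_t hdr_len)
{
	if (toy_cow(skb, hdr_len))
		return -1;
	skb->data -= hdr_len;
	skb->len += hdr_len;
	memcpy(skb->data, hdr, hdr_len);
	return 0;
}

/*
 * "Receive" a 64-byte packet with the given headroom, then "forward" it
 * by prepending an Ethernet-sized header. Returns the number of
 * reallocations that were needed, or -1 on allocation failure.
 */
static int forward_reallocs(size_t rx_headroom)
{
	static const unsigned char payload[64] = { 0x45 };
	static const unsigned char eth_hdr[TOY_ETH_HLEN] = { 0 };
	struct toy_skb skb;
	int n;

	if (toy_alloc(&skb, payload, sizeof(payload), rx_headroom))
		return -1;
	if (toy_push(&skb, eth_hdr, sizeof(eth_hdr))) {
		free(skb.head);
		return -1;
	}
	n = (int)skb.reallocs;
	free(skb.head);
	return n;
}
```

Calling forward_reallocs(0) models the receive path without reserved headroom and costs one reallocation per packet. Calling forward_reallocs(TOY_ETH_HLEN + 2) models the workaround from the mail and costs none. How much headroom a real driver should reserve is exactly the open question above, so the constant here is only a placeholder.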