Re: [PATCH v2 2/2] net: ethernet: ti: am65-cpsw: Add minimal XDP support

On 3/5/24 14:28, Andrew Lunn wrote:
> On Tue, Mar 05, 2024 at 11:46:00AM +0100, Julien Panis wrote:
>> On 3/1/24 17:38, Andrew Lunn wrote:
>>> On Fri, Mar 01, 2024 at 04:02:53PM +0100, Julien Panis wrote:
>>>> This patch adds XDP (eXpress Data Path) support to TI AM65 CPSW
>>>> Ethernet driver. The following features are implemented:
>>>> - NETDEV_XDP_ACT_BASIC (XDP_PASS, XDP_TX, XDP_DROP, XDP_ABORTED)
>>>> - NETDEV_XDP_ACT_REDIRECT (XDP_REDIRECT)
>>>> - NETDEV_XDP_ACT_NDO_XMIT (ndo_xdp_xmit callback)
>>>>
>>>> The page pool memory model is used to get better performance.
>>> Do you have any benchmark numbers? It should help with non-XDP
>>> traffic as well. So maybe iperf numbers before and after?
>>>
>>> 	Andrew
>> Argh... Houston, we have a problem. I checked my v3, which is ready for
>> submission, with iperf3:
>> 1) Before = without page pool -> 500 Mbits/sec
>> 2) After = with page pool -> 442 Mbits/sec
>> -> ~ 10% worse with page pool here.
>>
>> Unless the difference is not due to page pool. Maybe there's something else
>> which is not good in my patch. I'm going to send the v3 which uses page pool,
>> hopefully someone will find something suspicious. Meanwhile, I'll carry on
>> investigating: I'll check the results with my patch, removing only the use of
>> page pool.
> You can also go the other way. First add page pool support. For the
> FEC, that improved its performance. Then add XDP, which I think
> decreased the performance a little. It is extra processing in the hot
> path, so a little loss is not surprising.
>
> What tends to be expensive with ARM is cache invalidation and
> flush. So make sure you have the lengths correct. You don't want to
> operate on more memory than necessary. No point flushing the full MTU
> for a 64 byte TCP ACK, etc.
>
>        Andrew
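
To put a rough number on the cache maintenance point above, here is a small user-space sketch, not driver code. It counts how many cache lines a sync would cover for a given length. The 64-byte cache line size, the 1536-byte buffer size and both function names are assumptions chosen for illustration; the real values depend on the SoC and on how the driver sizes its RX buffers.

```c
#include <stddef.h>

/* Assumed cache line size, for illustration only. */
#define CACHE_LINE_BYTES 64u

/* Number of cache lines covered by the byte range [offset, offset + len). */
size_t cache_lines_touched(size_t offset, size_t len)
{
	size_t first, last;

	if (len == 0)
		return 0;

	first = offset / CACHE_LINE_BYTES;
	last = (offset + len - 1) / CACHE_LINE_BYTES;
	return last - first + 1;
}

/*
 * Length to sync for one received frame: the length actually received,
 * capped at the buffer size, rather than the whole buffer every time.
 */
size_t rx_sync_len(size_t pkt_len, size_t buf_len)
{
	return pkt_len < buf_len ? pkt_len : buf_len;
}
```

Under these assumptions, a 64-byte frame starting on a cache line boundary covers a single line, while syncing 1500 bytes covers 24 lines, so the sync length matters most for small packets such as ACKs.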

I reverted the code step by step and found what makes a significant
difference. Here are the main tests performed (results in Mbits/sec):

1) Page pool without XDP code -> res = 442
Conclusion: No difference with or without XDP code.

2) From 1), page pool removed and replaced by the previous memory model
based on dev_alloc_page() function -> res = 418
Conclusion: Your advice was good, that's better with page pool. :)

3) From 2), am65_cpsw_alloc_skb() function removed and replaced by
netdev_alloc_skb_ip_align(), as used by the driver before -> res = 506
Conclusion: Here is where the loss comes from.
IOW, my am65_cpsw_alloc_skb() function is not good.
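
For readers comparing the three results, the relative differences can be computed as sketched below. The helper name is hypothetical; the inputs are the iperf3 figures quoted in this thread (442, 418 and 506 Mbits/sec).

```c
/*
 * Relative change from 'before' to 'after', in percent.
 * Negative means 'after' is slower than 'before'.
 */
double relative_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

For example, relative_change_pct(442.0, 418.0) gives the effect of removing page pool in step 2, and relative_change_pct(418.0, 506.0) gives the effect of switching back to netdev_alloc_skb_ip_align() in step 3.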

Initially, I mainly created this 'custom' am65_cpsw_alloc_skb() function
because I thought that none of the XDP memory models could be used along
with netdev_alloc_skb_ip_align() function. Was I wrong?
By creating this custom am65_cpsw_alloc_skb(), I also wanted to handle
the way headroom is reserved differently.

Julien
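
As background on the headroom question, here is a user-space sketch of the arithmetic involved; it does not reflect what am65_cpsw_alloc_skb() actually does. In the kernel, netdev_alloc_skb_ip_align() leaves NET_SKB_PAD + NET_IP_ALIGN bytes of headroom, while XDP-capable drivers conventionally reserve XDP_PACKET_HEADROOM (256 bytes). The function names are hypothetical, and the NET_SKB_PAD and NET_IP_ALIGN values are passed in as parameters because they are architecture-dependent.

```c
/* Value of XDP_PACKET_HEADROOM in the kernel UAPI headers. */
#define XDP_HEADROOM_BYTES 256u

/*
 * Headroom left in front of the packet data by
 * netdev_alloc_skb_ip_align(): NET_SKB_PAD plus NET_IP_ALIGN.
 */
unsigned int skb_ip_align_headroom(unsigned int net_skb_pad,
				   unsigned int net_ip_align)
{
	return net_skb_pad + net_ip_align;
}

/* Bytes missing to reach 'wanted' headroom; 0 if there is already enough. */
unsigned int headroom_shortfall(unsigned int have, unsigned int wanted)
{
	return have >= wanted ? 0 : wanted - have;
}
```

With assumed values of NET_SKB_PAD = 64 and NET_IP_ALIGN = 0, skb_ip_align_headroom(64, 0) yields 64 bytes, which is 192 bytes short of XDP_HEADROOM_BYTES.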




