Re: Profiling XDP programs for performance issues

On Fri, 9 Apr 2021 08:40:51 +0200
Magnus Karlsson <magnus.karlsson@xxxxxxxxx> wrote:

> On Fri, Apr 9, 2021 at 1:06 AM Neal Shukla <nshukla@xxxxxxxxxxxxx> wrote:
> >
> > Using perf, we've confirmed that the line mentioned has a 25.58% cache miss
> > rate.  
> 
> Do these hit in the LLC or in DRAM? In any case, your best bet is
> likely to prefetch this into your L1/L2. In my experience, the best
> way to do this is not to use an explicit prefetch instruction but to
> touch/fetch the cache lines you need in the beginning of your
> computation and let the fetch latency and the usage of the first cache
> line hide the latencies of fetching the others. In your case, touch
> both metadata and packet at the same time. Work with the metadata and
> other things then come back to the packet data and hopefully the
> relevant part will reside in the cache or registers by now. If that
> does not work, touch packet number N+1 just before starting with
> packet N.
> 
> Very general recommendations but hope it helps anyway. How exactly to
> do this efficiently is very application dependent.
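
For illustration, here is a minimal user-space sketch of the "touch packet
N+1 just before starting with packet N" idea described above. It is not
driver or BPF code. The names struct pkt, touch(), process_one() and
process_batch() are hypothetical and exist only for this example, and the
byte sum is a stand-in for real per-packet work.

```c
/* Hypothetical user-space sketch, not driver or BPF code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct pkt {
	const uint8_t *data;	/* first byte of packet data */
	size_t len;		/* number of valid bytes at data */
};

/* Touch the first cache line with a plain load instead of an explicit
 * prefetch instruction. The volatile cast keeps the compiler from
 * dropping the otherwise unused read.
 */
static inline void touch(const uint8_t *p)
{
	(void)*(const volatile uint8_t *)p;
}

/* Stand-in for real per-packet work: sum up to the first 64 bytes. */
static uint32_t process_one(const struct pkt *p)
{
	uint32_t sum = 0;
	size_t i, n = p->len < 64 ? p->len : 64;

	for (i = 0; i < n; i++)
		sum += p->data[i];
	return sum;
}

/* Process a batch, touching packet N+1 just before working on packet N,
 * so that the work on packet N can overlap with the fetch of N+1.
 */
static uint32_t process_batch(const struct pkt *pkts, size_t n)
{
	uint32_t total = 0;
	size_t i;

	assert(pkts || n == 0);
	for (i = 0; i < n; i++) {
		if (i + 1 < n && pkts[i + 1].len > 0)
			touch(pkts[i + 1].data);
		total += process_one(&pkts[i]);
	}
	return total;
}
```

Whether this helps depends on how much work is done per packet and on
where the data currently resides, so measure before and after.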

I see you use the i40e driver, which calls net_prefetch(xdp->data)
*AFTER* the XDP hook has run.  That could explain the cache misses you
are seeing.

Can you try the patch below, and see if it solves your observed issue?

> > On Thu, Apr 8, 2021 at 2:38 PM Zvi Effron <zeffron@xxxxxxxxxxxxx> wrote:  
> > >
> > > Apologies for the spam to anyone who received my first response, but
> > > it was accidentally sent as HTML and rejected by the mailing list.
> > >
> > > On Thu, Apr 8, 2021 at 11:20 AM Neal Shukla <nshukla@xxxxxxxxxxxxx> wrote:  
> > > >
> > > > System Info:
> > > > CPU: Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
> > > > Network Adapter/NIC: Intel X710
> > > > Driver: i40e
> > > > Kernel version: 5.8.15
> > > > OS: Fedora 33
> > > >  
> > >
> > > Slight correction, we're actually on the 5.10.10 kernel.

[PATCH] i40e: Move net_prefetch to benefit XDP

From: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>

DEBUG PATCH WITH XXX comments

Signed-off-by: Jesper Dangaard Brouer <brouer@xxxxxxxxxx>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index e398b8ac2a85..c09b8a5e6a2a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2121,7 +2121,7 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
 	struct sk_buff *skb;
 
 	/* prefetch first cache line of first page */
-	net_prefetch(xdp->data);
+	net_prefetch(xdp->data); // XXX: Too late for XDP
 
 	/* Note, we get here by enabling legacy-rx via:
 	 *
@@ -2205,7 +2205,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 	 * likely have a consumer accessing first few bytes of meta
 	 * data, and then actual data.
 	 */
-	net_prefetch(xdp->data_meta);
+//	net_prefetch(xdp->data_meta); //XXX: too late for XDP
 
 	/* build an skb around the page buffer */
 	skb = build_skb(xdp->data_hard_start, truesize);
@@ -2513,6 +2513,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 			/* At larger PAGE_SIZE, frame_sz depend on len size */
 			xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, size);
 #endif
+			net_prefetch(xdp.data);
 			skb = i40e_run_xdp(rx_ring, &xdp);
 		}
 
@@ -2530,6 +2531,7 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget)
 		} else if (skb) {
 			i40e_add_rx_frag(rx_ring, rx_buffer, skb, size);
 		} else if (ring_uses_build_skb(rx_ring)) {
+			// XXX: net_prefetch called after i40e_run_xdp()
 			skb = i40e_build_skb(rx_ring, rx_buffer, &xdp);
 		} else {
 			skb = i40e_construct_skb(rx_ring, rx_buffer, &xdp);



