On Fri, Apr 08, 2022 at 11:17:56AM -0700, Jakub Kicinski wrote:
> On Fri, 8 Apr 2022 15:48:44 +0300 Maxim Mikityanskiy wrote:
> > >> 4. A slow or malicious AF_XDP application may easily cause an overflow of
> > >> the hardware receive ring. Your feature introduces a mechanism to pause the
> > >> driver while the congestion is on the application side, but no symmetric
> > >> mechanism to pause the application when the driver is close to an overflow.
> > >> I don't know the behavior of Intel NICs on overflow, but in our NICs it's
> > >> considered a critical error, that is followed by a recovery procedure, so
> > >> it's not something that should happen under normal workloads.
> > >
> > > I'm not sure I follow on this one. Feature is about overflowing the XSK
> > > receive ring, not the HW one, right?
> >
> > Right. So we have this pipeline of buffers:
> >
> > NIC--> [HW RX ring] --NAPI--> [XSK RX ring] --app--> consumes packets
> >
> > Currently, when the NIC puts stuff in HW RX ring, NAPI always runs and
> > drains it either to XSK RX ring or to /dev/null if XSK RX ring is full.
> > The driver fulfills its responsibility to prevent overflows of HW RX
> > ring. If the application doesn't consume quick enough, the frames will
> > be leaked, but it's only the application's issue, the driver stays
> > consistent.
> >
> > After the feature, it's possible to pause NAPI from the userspace
> > application, effectively disrupting the driver's consistency. I don't
> > think an XSK application should have this power.
>
> +1
> cover letter refers to busy poll, but did that test enable prefer busy
> poll w/ the timeout configured right? It seems like similar goal can
> be achieved with just that.

AF_XDP busy poll, where the app and driver run on the same core, does not
bring much value without configuring gro_flush_timeout and
napi_defer_hard_irqs, so all of the busy poll tests were done with:

echo 2 | sudo tee /sys/class/net/ens4f1/napi_defer_hard_irqs
echo 200000 | sudo tee /sys/class/net/ens4f1/gro_flush_timeout

That said, in the case I'm trying to improve, performance can still suffer
and packets would not make it up to user space even with the timeout
configured.
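
For reference, a minimal sketch of the socket-side knobs that go along with
those two sysfs writes, i.e. how an XSK socket opts into preferred busy
polling. The xsk_fd name and the timeout/budget values below are only
placeholders, not the exact settings used in the tests:

#include <sys/socket.h>

/* Not all libc headers carry these yet; values from the kernel uapi. */
#ifndef SO_PREFER_BUSY_POLL
#define SO_PREFER_BUSY_POLL	69
#define SO_BUSY_POLL_BUDGET	70
#endif

static int xsk_enable_prefer_busy_poll(int xsk_fd)
{
	int opt;

	/* Prefer busy polling over interrupt-driven NAPI for this socket. */
	opt = 1;
	if (setsockopt(xsk_fd, SOL_SOCKET, SO_PREFER_BUSY_POLL,
		       &opt, sizeof(opt)))
		return -1;

	/* Busy-poll timeout in microseconds (placeholder value). */
	opt = 20;
	if (setsockopt(xsk_fd, SOL_SOCKET, SO_BUSY_POLL,
		       &opt, sizeof(opt)))
		return -1;

	/* Max packets processed per busy-poll call (placeholder value). */
	opt = 64;
	if (setsockopt(xsk_fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET,
		       &opt, sizeof(opt)))
		return -1;

	return 0;
}

With these set, the application is expected to drive NAPI itself via
poll()/recvfrom() on the socket rather than relying on the interrupt path.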