On Fri, 8 Apr 2022 15:48:44 +0300 Maxim Mikityanskiy wrote:
> >> 4. A slow or malicious AF_XDP application may easily cause an overflow of
> >> the hardware receive ring. Your feature introduces a mechanism to pause the
> >> driver while the congestion is on the application side, but no symmetric
> >> mechanism to pause the application when the driver is close to an overflow.
> >> I don't know the behavior of Intel NICs on overflow, but in our NICs it's
> >> considered a critical error, that is followed by a recovery procedure, so
> >> it's not something that should happen under normal workloads.
> >
> > I'm not sure I follow on this one. Feature is about overflowing the XSK
> > receive ring, not the HW one, right?
>
> Right. So we have this pipeline of buffers:
>
> NIC--> [HW RX ring] --NAPI--> [XSK RX ring] --app--> consumes packets
>
> Currently, when the NIC puts stuff in HW RX ring, NAPI always runs and
> drains it either to XSK RX ring or to /dev/null if XSK RX ring is full.
> The driver fulfills its responsibility to prevent overflows of HW RX
> ring. If the application doesn't consume quick enough, the frames will
> be leaked, but it's only the application's issue, the driver stays
> consistent.
>
> After the feature, it's possible to pause NAPI from the userspace
> application, effectively disrupting the driver's consistency. I don't
> think an XSK application should have this power.

+1

The cover letter refers to busy poll, but did that test enable prefer
busy poll w/ the timeout configured right? It seems like a similar goal
can be achieved with just that.
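
For reference, a minimal sketch of the socket side of a prefer busy poll
setup, assuming a kernel with SO_PREFER_BUSY_POLL support (5.11 or later,
as far as I know). The helper name enable_prefer_busy_poll() and the
values passed to it are hypothetical and only for illustration; this is
not taken from the series under review.

```c
#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for older libc headers. The values are the ones from
 * asm-generic/socket.h and differ on a few architectures
 * (e.g. alpha, mips, parisc, sparc). */
#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46
#endif
#ifndef SO_PREFER_BUSY_POLL
#define SO_PREFER_BUSY_POLL 69
#endif
#ifndef SO_BUSY_POLL_BUDGET
#define SO_BUSY_POLL_BUDGET 70
#endif

/*
 * Hypothetical helper: opt a socket into preferred busy polling.
 * Returns 0 on success, or a negative errno value for the first
 * setsockopt() call that fails.
 */
static int enable_prefer_busy_poll(int fd, int busy_poll_usecs, int budget)
{
	int one = 1;

	/* Prefer busy polling over interrupt/softirq-driven NAPI. */
	if (setsockopt(fd, SOL_SOCKET, SO_PREFER_BUSY_POLL,
		       &one, sizeof(one)))
		return -errno;

	/* Approximate time in usecs to busy poll on a blocking receive. */
	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL,
		       &busy_poll_usecs, sizeof(busy_poll_usecs)))
		return -errno;

	/* Upper bound on packets handled per busy-poll NAPI invocation. */
	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET,
		       &budget, sizeof(budget)))
		return -errno;

	return 0;
}
```

In the AF_XDP case fd would be the XSK socket, called as e.g.
enable_prefer_busy_poll(xsk_fd, 20, 64), where 20 and 64 are illustrative
values only. The timeout itself is per-device and is configured
separately, via /sys/class/net/<dev>/napi_defer_hard_irqs and
/sys/class/net/<dev>/gro_flush_timeout. Enabling SO_PREFER_BUSY_POLL
should require CAP_NET_ADMIN, so expect -EPERM when unprivileged.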