On Fri, May 22, 2020 at 9:44 AM Bartosz Golaszewski <brgl@xxxxxxxx> wrote:
> Wed, May 20, 2020 at 23:23 Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > On Wed, May 20, 2020 at 7:35 PM Bartosz Golaszewski <brgl@xxxxxxxx> wrote:
> > > Wed, May 20, 2020 at 16:37 Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > My thinking was this: if I mask the relevant interrupt (TX/RX
> > > complete) and ack it right away, the status bit will be asserted on
> > > the next packet received/sent, but the process won't get interrupted,
> > > and when I unmask it, it will fire right away and I won't have to
> > > recheck the status register. I noticed that if I ack it at the end of
> > > the napi poll callback, I end up missing certain TX complete interrupts
> > > and seeing a lot of retransmissions, even if I reread the status
> > > register. I'm not yet sure where this race happens.
> >
> > Right, I see. If you just ack at the end of the poll function, you need
> > to check the rings again to ensure you did not miss an interrupt
> > between observing both rings to be empty and the irq-ack.
> >
> > I suspect it's still cheaper to check the two rings with an uncached
> > read from memory than to do the read-modify-write on the mmio,
> > but you'd have to measure that to be sure.
> >
>
> Unfortunately the PHY on the board I have is 100Mbps, which is the
> limiting factor in benchmarking this driver. :(
>
> If you're fine with this, I'd like to fix the minor issues you
> pointed out and stick with the current approach for now. We can always
> fix the implementation in the future once a board with a Gigabit PHY
> is out. Most ethernet drivers don't use such fine-grained interrupt
> control anyway. I expect the performance differences to be minuscule
> really.

Ok, fair enough. The BQL limiting is the part that matters the most for
performance on slow lines (preventing long latencies from buffer bloat),
and you have that now.

       Arnd
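
A minimal sketch of the ordering described above (clean the rings, ack, re-check,
only then unmask), assuming a made-up example_* driver: the struct, register
offsets and ring helpers below are hypothetical, and only napi_complete_done(),
napi_schedule(), writel() and BIT() are the real kernel API. This is not the
code under review, just an illustration of the scheme being discussed.

#include <linux/netdevice.h>
#include <linux/io.h>
#include <linux/bits.h>

/* Hypothetical register layout and interrupt bits. */
#define EXAMPLE_REG_STATUS      0x00
#define EXAMPLE_REG_INT_MASK    0x04
#define EXAMPLE_INT_TX_DONE     BIT(0)
#define EXAMPLE_INT_RX_DONE     BIT(1)

/* Hypothetical private data, just enough for the sketch. */
struct example_priv {
        struct net_device *ndev;
        struct napi_struct napi;
        void __iomem *base;
};

/* Hypothetical ring helpers, assumed to live elsewhere in the driver. */
void example_tx_ring_clean(struct example_priv *priv);
int example_rx_ring_process(struct example_priv *priv, int budget);
bool example_ring_has_work(struct example_priv *priv);

static int example_napi_poll(struct napi_struct *napi, int budget)
{
        struct example_priv *priv = container_of(napi, struct example_priv, napi);
        int rx_done;

        /* Reclaim completed TX descriptors and process received frames. */
        example_tx_ring_clean(priv);
        rx_done = example_rx_ring_process(priv, budget);
        if (rx_done >= budget)
                return budget; /* more RX work, NAPI stays scheduled */

        if (napi_complete_done(napi, rx_done)) {
                /* Ack the TX/RX completion bits only now, at the end of poll. */
                writel(EXAMPLE_INT_TX_DONE | EXAMPLE_INT_RX_DONE,
                       priv->base + EXAMPLE_REG_STATUS);

                /*
                 * Re-check the rings: a completion that arrived between the
                 * ring scans above and the ack was just thrown away, so it
                 * has to be picked up here rather than by a new interrupt.
                 */
                if (example_ring_has_work(priv))
                        napi_schedule(napi);
                else
                        /* Safe to unmask; the next event raises a fresh IRQ. */
                        writel(0, priv->base + EXAMPLE_REG_INT_MASK);
        }

        return rx_done;
}

The re-check is the whole point: any completion landing between the final ring
scan and the ack has had its interrupt discarded, so the poll function must
notice it itself and reschedule instead of waiting for an IRQ that will never come.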
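
And since BQL is the part singled out as mattering most on a slow link, here is
the producer/completion pairing that accounting relies on, again with
hypothetical example_* helpers reusing the example_priv above;
netdev_tx_sent_queue() and netdev_tx_completed_queue() are the real kernel API,
but the wiring shown is only a typical pattern, not the driver's actual code.

/* Hypothetical TX ring helpers, assumed to exist elsewhere. */
void example_tx_ring_put(struct example_priv *priv, struct sk_buff *skb);
void example_tx_ring_reclaim(struct example_priv *priv,
                             unsigned int *pkts, unsigned int *bytes);

static netdev_tx_t example_start_xmit(struct sk_buff *skb,
                                      struct net_device *ndev)
{
        struct example_priv *priv = netdev_priv(ndev);
        unsigned int len = skb->len;

        example_tx_ring_put(priv, skb);  /* queue the frame on the TX ring */

        /* Tell BQL how many bytes are now in flight on this queue. */
        netdev_tx_sent_queue(netdev_get_tx_queue(ndev, 0), len);

        return NETDEV_TX_OK;
}

/* Called from the napi poll path once descriptors have completed. */
static void example_tx_complete(struct example_priv *priv)
{
        unsigned int pkts, bytes;

        example_tx_ring_reclaim(priv, &pkts, &bytes);

        /* Releasing completed bytes lets BQL keep the HW queue shallow. */
        netdev_tx_completed_queue(netdev_get_tx_queue(priv->ndev, 0),
                                  pkts, bytes);
}

On a 100Mbps link this accounting is what keeps the TX queue from holding many
packets' worth of latency, which is the "slow lines" point above.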