On Wed, 2022-03-02 at 22:40 -0800, Jakub Kicinski wrote: > On Mon, 28 Feb 2022 12:33:28 -0600 Robert Hancock wrote: > > There is an oddity in the way the RSR register flags propagate to the > > ISR register (and the actual interrupt output) on this hardware: it > > appears that RSR register bits only result in ISR being asserted if the > > interrupt was actually enabled at the time, so enabling interrupts with > > RSR bits already set doesn't trigger an interrupt to be raised. There > > was already a partial fix for this race in the macb_poll function where > > it checked for RSR bits being set and re-triggered NAPI receive. > > However, there was a still a race window between checking RSR and > > actually enabling interrupts, where a lost wakeup could happen. It's > > necessary to check again after enabling interrupts to see if RSR was set > > just prior to the interrupt being enabled, and re-trigger receive in that > > case. > > > > This issue was noticed in a point-to-point UDP request-response protocol > > which periodically saw timeouts or abnormally high response times due to > > received packets not being processed in a timely fashion. In many > > applications, more packets arriving, including TCP retransmissions, would > > cause the original packet to be processed, thus masking the issue. > > > > Also change from using napi_reschedule to napi_schedule, as the only > > difference is the presence of a return value which wasn't used here > > anyway. > > Let's leave that out from this particular patch - fixes should be > minimal, this sounds like cleanup. Can do. > > > Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is > > done") > > Cc: stable@xxxxxxxxxxxxxxx > > Co-developed-by: Scott McNutt <scott.mcnutt@xxxxxxxxxxxx> > > Signed-off-by: Scott McNutt <scott.mcnutt@xxxxxxxxxxxx> > > Signed-off-by: Robert Hancock <robert.hancock@xxxxxxxxxx> > > --- > > drivers/net/ethernet/cadence/macb_main.c | 26 ++++++++++++++++++++++-- > > 1 file changed, 24 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/net/ethernet/cadence/macb_main.c > > b/drivers/net/ethernet/cadence/macb_main.c > > index 98498a76ae16..338660fe1d93 100644 > > --- a/drivers/net/ethernet/cadence/macb_main.c > > +++ b/drivers/net/ethernet/cadence/macb_main.c > > @@ -1573,14 +1573,36 @@ static int macb_poll(struct napi_struct *napi, int > > budget) > > if (work_done < budget) { > > napi_complete_done(napi, work_done); > > > > - /* Packets received while interrupts were disabled */ > > + /* RSR bits only seem to propagate to raise interrupts when > > + * interrupts are enabled at the time, so if bits are already > > + * set due to packets received while interrupts were disabled, > > + * they will not cause another interrupt to be generated when > > + * interrupts are re-enabled. > > + * Check for this case here. > > + */ > > status = macb_readl(bp, RSR); > > Which case is more likely - status == 0 or != 0? > > Because MMIO reads are usually expensive so if status is likely > to be zero your other suggestion could be lower overhead. > It'd be good to mention this expectation in the commit message > or comment here. There was some measurement done on this that motivated a previous patch in this area: commit 504ad98df3a6b027ce997ca8f620e949cafb151f Author: Soren Brinkmann <soren.brinkmann@xxxxxxxxxx> Date: Sun May 4 15:43:01 2014 -0700 net: macb: Remove 'unlikely' optimization Coverage data suggests that the unlikely case of receiving data while the receive handler is running may not be that unlikely. Coverage data after running iperf for a while: 91320: 891: work_done = bp->macbgem_ops.mog_rx(bp, budget); 91320: 892: if (work_done < budget) { 2362: 893: napi_complete(napi); -: 894: -: 895: /* Packets received while interrupts were disabled */ 4724: 896: status = macb_readl(bp, RSR); 2362: 897: if (unlikely(status)) { 762: 898: if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE) 762: 899: macb_writel(bp, ISR, MACB_BIT(RCOMP)); -: 900: napi_reschedule(napi); -: 901: } else { 1600: 902: macb_writel(bp, IER, MACB_RX_INT_FLAGS); -: 903: } -: 904: } Signed-off-by: Soren Brinkmann <soren.brinkmann@xxxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> So it looks like the non-zero status case was being hit roughly 1/3 of the time, at least under that particular workload. It may depend heavily on workload etc. but doesn't seem to be clear-cut to optimize one way or the other. For the new "double check" branch, from adding debug in, it appears that one is hit on the order of a few dozen times a day under constant load, so the "unlikely" seems appropriate there. > > > if (status) { > > if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE) > > queue_writel(queue, ISR, MACB_BIT(RCOMP)); > > - napi_reschedule(napi); > > + napi_schedule(napi); > > } else { > > queue_writel(queue, IER, bp->rx_intr_mask); > > + > > + /* Packets could have been received in the window > > + * between the check above and re-enabling interrupts. > > + * Therefore, a double-check is required to avoid > > + * losing a wakeup. This can potentially race with > > + * the interrupt handler doing the same actions if an > > + * interrupt is raised just after enabling them, but > > + * this should be harmless. > > + */ > > + status = macb_readl(bp, RSR); > > + if (unlikely(status)) { > > + queue_writel(queue, IDR, bp->rx_intr_mask); > > + if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE) > > + queue_writel(queue, ISR, > > MACB_BIT(RCOMP)); > > + napi_schedule(napi); > > + } > > } > > } > >