From: Gerhard Engleder <gerhard@xxxxxxxxxxxxxxxxxxxxx> Date: Sat, 14 Dec 2024 20:16:23 +0100 > From: Gerhard Engleder <eg@xxxxxxxx> > > Link down and up triggers update of MTA table. This update executes many > PCIe writes and a final flush. Thus, PCIe will be blocked until all > writes are flushed. As a result, DMA transfers of other targets suffer > from delay in the range of 50us. This results in timing violations on > real-time systems during link down and up of e1000e in combination with > an Intel i3-2310E Sandy Bridge CPU. > > The i3-2310E is quite old. Launched 2011 by Intel but still in use as > robot controller. The exact root cause of the problem is unclear and > this situation won't change as Intel support for this CPU has ended > years ago. Our experience is that the number of posted PCIe writes needs > to be limited at least for real-time systems. With posted PCIe writes a > much higher throughput can be generated than with PCIe reads which > cannot be posted. Thus, the load on the interconnect is much higher. > Additionally, a PCIe read waits until all posted PCIe writes are done. > Therefore, the PCIe read can block the CPU for much more than 10us if a > lot of PCIe writes were posted before. Both issues are the reason why we > are limiting the number of posted PCIe writes in row in general for our > real-time systems, not only for this driver. > > A flush after a low enough number of posted PCIe writes eliminates the > delay but also increases the time needed for MTA table update. The > following measurements were done on i3-2310E with e1000e for 128 MTA > table entries: > > Single flush after all writes: 106us > Flush after every write: 429us > Flush after every 2nd write: 266us > Flush after every 4th write: 180us > Flush after every 8th write: 141us > Flush after every 16th write: 121us > > A flush after every 8th write delays the link up by 35us and the > negative impact to DMA transfers of other targets is still tolerable. > > Execute a flush after every 8th write. This prevents overloading the > interconnect with posted writes. > > Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@xxxxxxxxx> > CC: Vitaly Lifshits <vitaly.lifshits@xxxxxxxxx> > Link: https://lore.kernel.org/netdev/f8fe665a-5e6c-4f95-b47a-2f3281aa0e6c@xxxxxxx/T/ > Signed-off-by: Gerhard Engleder <eg@xxxxxxxx> > --- > v3: > - mention problematic platform explicitly (Bjorn Helgaas) > - improve comment (Paul Menzel) > > v2: > - remove PREEMPT_RT dependency (Andrew Lunn, Przemek Kitszel) > --- > drivers/net/ethernet/intel/e1000e/mac.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/intel/e1000e/mac.c b/drivers/net/ethernet/intel/e1000e/mac.c > index d7df2a0ed629..0174c16bbb43 100644 > --- a/drivers/net/ethernet/intel/e1000e/mac.c > +++ b/drivers/net/ethernet/intel/e1000e/mac.c > @@ -331,8 +331,15 @@ void e1000e_update_mc_addr_list_generic(struct e1000_hw *hw, > } > > /* replace the entire MTA table */ > - for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) > + for (i = hw->mac.mta_reg_count - 1; i >= 0; i--) { > E1000_WRITE_REG_ARRAY(hw, E1000_MTA, i, hw->mac.mta_shadow[i]); > + > + /* do not queue up too many posted writes to prevent increased > + * latency for other devices on the interconnect > + */ I think a multi-line comment should start with a capital letter and have a '.' at the end of the sentence. + netdev code doesn't have the special rule for multi-line comments, they should look the same way as in the rest of the kernel: /* * Do not queue up ... * latency ... */ > + if ((i % 8) == 0 && i != 0) > + e1e_flush(); IIRC explicit `== 0` / `!= 0` are considered redundant. if (!(i % 8) && i) I'd also mention in the comment above that this means "flush each 8th write" and why exactly 8. > + } > e1e_flush(); > } Thanks, Olek