Re: [RFC] e1000e: Add delays after writing to registers

Henrik Austad <henrik@xxxxxxxxx> · Tue, 3 Nov 2015 20:42:46 +0100

On Tue, Nov 03, 2015 at 11:43:21AM -0600, Jonathan David wrote:
> On 10/22/2015 12:59 AM, Henrik Austad wrote:
> >On Wed, Oct 21, 2015 at 05:07:48PM -0500, Jonathan David wrote:
> >>There is a noticeable impact on determinism when a large number of
> >>writes are flushed. Writes to the hardware registers are sent across
> >>the PCI bus and take a significant amount of time to complete after
> >>a flush, which causes high priority tasks (including interrupts) to
> >>be delayed.
> >
> >Do you see this in the entire system, or on the core where the write was
> >triggered?
> 
> Only on the core where the writes are issued.

Ok, not a global freeze then :)

> >>Adding a delay after long series of writes gives them time to
> >>complete, and for higher priority tasks to run unimpeded.
> >
> >Aren't we running with threaded interrupts?
> >
> >What happens to the thread(s) pushing data to the network?
> >What about xmit-buffer once it is full? Which thread will block on send or
> >have its sk_buff dropped?
> 
> All of this is totally irrelevant to the problem we are seeing.

If this is irrelevant, why hack at the network-driver, hmm?

> The e1000x driver itself is not responsible for the delay here. 

... then why hack the network-driver?

> The issue is with PCI where issuing a large number of MMIO writes 
> followed by a read (to force said writes to execute) will stall the CPU. 
> When the CPU is stalled, no interrupts are serviced, including the local 
> apic timer interrupt, which was responsible for waking up cyclictest. 
> This behavior was observed within traces gathered from cyclictest with 
> ftrace enabled.

So you get bogged down with interrupts disabled, which brings me back to 
my question: are we not running with threaded interrupts? What exactly are 
we triggering here?

> >I'm not sure if adding random delay and giving an unpredictable impact on
> >completely random threads is the best way to solve this..
> 
> Agreed, we know that this is a hack. Do you have any better solutions?

- Send less data over the network ;)
- Pin (affine) network interrupts to a non-rt core and place rt-tasks on 
  cores shielded from interrupts (why are you not doing this already?)
- Look at the PCI driver and add breathers there (e.g. after X MMIO 
  writes, enable interrupts, disable interrupts and continue)
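
To make the last point a bit more concrete, here is a rough userspace mock 
of the "breather" idea. This is only a sketch and has not been run against 
real hardware. Every name in it (WRITES_PER_BATCH, struct reg_write, 
write_batched, the breather callback) is made up for illustration, and 
plain memory stands in for the MMIO registers. In a real driver the 
breather would be the point where interrupts are briefly enabled and 
disabled again.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tunable: posted writes to issue before taking a breather. */
#define WRITES_PER_BATCH 8

/* One pending register write; addr stands in for an MMIO register. */
struct reg_write {
	volatile uint32_t *addr;
	uint32_t val;
};

/*
 * Issue n writes. After each full batch that is not the last one, read
 * back the register just written (on real PCI a read forces the posted
 * writes out) and call breather() if one was supplied. Returns the
 * number of batch boundaries hit, i.e. how many breathers were possible.
 */
static size_t write_batched(const struct reg_write *w, size_t n,
			    void (*breather)(void))
{
	size_t i, breaks = 0;

	for (i = 0; i < n; i++) {
		*w[i].addr = w[i].val;

		if ((i + 1) % WRITES_PER_BATCH == 0 && i + 1 < n) {
			(void)*w[i].addr;	/* flush this small batch */
			if (breather)
				breather();
			breaks++;
		}
	}

	if (n)
		(void)*w[n - 1].addr;	/* final flush of the remainder */

	return breaks;
}
```

The point of the pattern is that each flush only has a handful of writes 
queued behind it, so no single stall gets long, and higher priority work 
has a chance to run between batches. What a sensible batch size is would 
have to be measured.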

I'm sure adding random delay to the network-driver solves your particular 
problem, and that is fine, but I fear that you'll just end up opening a 
nasty can of worms. Debugging this when (not if) it fails is not going to 
be very pretty methinks.

Anyhow, I'm not about to tell you what you can or cannot do, just that I 
see some rather nasty bumps in the road ahead of you with the current 
solution.

-- 
Henrik Austad