Re: [RFC] e1000e: Add delays after writing to registers

On Tue, Nov 03, 2015 at 04:10:23PM -0600, Jonathan David wrote:
> On 11/03/2015 01:42 PM, Henrik Austad wrote:
> >On Tue, Nov 03, 2015 at 11:43:21AM -0600, Jonathan David wrote:
> >>On 10/22/2015 12:59 AM, Henrik Austad wrote:
> 
> >>>>Adding a delay after long series of writes gives them time to
> >>>>complete, and for higher priority tasks to run unimpeded.
> >>>
> >>>Aren't we running with threaded interrupts?
> >>>
> >>>What happens to the thread(s) pushing data to the network?
> >>>What about xmit-buffer once it is full? Which thread will block on send or
> >>>have its sk_buff dropped?
> >>
> >>All of this is totally irrelevant to the problem we are seeing.
> >
> >If this is irrelevant, why hack at the network-driver, hmm?
> 
> It is relevant to the network driver, as this is where the symptoms were
> discovered; however, it has no relation to the packet delivery path. This is
> related purely to link configuration.

I was under the impression that PCI link configuration/training was about
negotiating link speed and the like, not about how many MMIO reads/writes
the link can handle. Then again, a lot of this stuff is pure (black) magic.

> >>The e1000x driver itself is not responsible for the delay here.
> >
> >... then why hack the network-driver?
> 
> Lack of better known options.
> 
> >>The issue is with PCI where issuing a large number of MMIO writes
> >>followed by a read (to force said writes to execute) will stall the CPU.
> >>When the CPU is stalled, no interrupts are serviced, including the local
> >>apic timer interrupt, which was responsible for waking up cyclictest.
> >>This behavior was observed within traces gathered from cyclictest with
> >>ftrace enabled.
> >
> >So you get bogged down with interrupts disabled;
> 
> No, interrupts are entirely enabled while the PCI MMIO writes/read are
> issued; but the local apic timer still arrives late, presumably because the
> CPU is waiting to complete whatever writes remain in the buffer.

Heh, strange. Is the interrupt signal itself delivered late as well, or is
only the handling of it delayed?

> I think this might be the root of our miscommunication. You are asking good
> questions about threaded interrupts, etc, but it isn't clear how they are
> related to the specific problem we are seeing.

Perhaps a trace of the problem could be shared?

A full function trace with IRQ events and timer events would be appreciated :)

-- 
Henrik Austad
