Re: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest

On Friday 23 September 2016 11:40:46, Koehrer Mathias wrote:
> Hi Sebastian,
> 
> 
> > thanks for the feedback.
> > 
> > > > I run the cyclictest with the following options:
> > > > # cyclictest -a -i 100 -d 10 -m -n -t -p 80
> > >
> > >
> > >
> > > there is -S. And then 100 might be a little tight.
> > >
> > >
> > >
> > > > Of course the 2 minutes run-time of cyclictest is only a rough first
> > > > estimate.
> >
> > >
> > >
> > > and with no load…
> > >
> > >
> > >
> > > > Once I configure one of the i350 ports
> > > > # ifconfig eth2 up 192.168.100.100
> > > > the cyclictest shows directly and reproducibly significantly larger
> > > > max latency values (40 microseconds, using the same conditions).
> > > 
> > > >
> > > >
> > > >
> > > > I did the very same test with kernel version 3.18.27-rt27.
> > > > With that version I did not see anything like that.
> > > >
> > > >
> > > >
> > > > Also, only the igb driver seems to cause the trouble. I also have an
> > > > e1000e based NIC in this PC and the usage of this driver does not
> > > > add any significant latency.
> > > 
> > > >
> > > >
> > > > Any idea on this?
> > >
> > >
> > >
> > > Does this also happen if you have the NIC up and you plug in / out the
> > > cable? There are two things that come to mind:
> > > 
> > >   https://lkml.kernel.org/r/1445465268-10347-1-git-send-email-> > > 
> > > jonathan.david@xxxxxx
> > >
> > >
> > >
> > > https://lkml.kernel.org/r/1445886895-3692-1-git-send-email-joshc@xxxxx
> > > m
> > 
> > 
> > This happens even if I have done "ifconfig up" on the NIC without having a
> > cable plugged in.
> > Also, it happens if I have a cable plugged in and the link is up but no
> > traffic is running via this NIC port.
> > It looks as if solely the configured NIC port is causing the additional
> > latency, no matter if traffic is flowing via this NIC or not and no
> > matter if the link is up or not.
> > I did the same test with the kernel/rt_preempt patch versions
> > 4.1.33-rt37 and 4.4.19-rt27; they show the very same behavior.
> > In contrast, version 3.18.27-rt27 is stable!
> > 
> > As mentioned before, the "igb" driver is causing the issue. The "e1000e"
> > driver works fine.
> > 
> 
> I did some further analysis.
> The code that is causing the long latencies seems to be the 
> function "igb_watchdog_task" within igb_main.c (Line: 4386). 
> This function will be called periodically.
> When I do a return at the beginning of this function the additional latency
> is not seen. In particular that function calls "igb_has_link" which seems
> to be one candidate that is causing additional latency.
> Do you have any clue how this code can be executed properly without causing
> the additional latencies?

IMHO something in igb_watchdog_task causes the latency independently of the
actual link state. At first glance I would suspect igb_update_stats (called
with a spinlock held), as it seems to do a lot of register reads. Maybe this
stalls somehow. Does the latency still occur if you comment out that spinlock
and the call to igb_update_stats?
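
To make the experiment concrete, here is a rough user-space sketch of what I
mean. It is not the igb code: mock_adapter, mock_update_stats,
mock_watchdog_task, the register count of 64 and the skip_stats switch are
all made up for illustration, and a C11 atomic_flag stands in for the kernel
spinlock. It only shows the shape of the test, i.e. running the watchdog once
with and once without the locked statistics update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical number of statistics registers; not taken from the driver. */
#define MOCK_NUM_STAT_REGS 64

/* User-space stand-in for the adapter structure (illustration only). */
struct mock_adapter {
    atomic_flag stats_lock;                        /* models the spinlock */
    volatile unsigned int regs[MOCK_NUM_STAT_REGS]; /* fake device registers */
    unsigned long long totals[MOCK_NUM_STAT_REGS];  /* accumulated counters */
    unsigned long reg_reads;                        /* register reads so far */
};

static void mock_adapter_init(struct mock_adapter *a)
{
    assert(a != NULL);
    atomic_flag_clear(&a->stats_lock);
    for (size_t i = 0; i < MOCK_NUM_STAT_REGS; i++) {
        a->regs[i] = 1;
        a->totals[i] = 0;
    }
    a->reg_reads = 0;
}

/* Models a statistics update that reads every register once. */
static void mock_update_stats(struct mock_adapter *a)
{
    for (size_t i = 0; i < MOCK_NUM_STAT_REGS; i++) {
        a->totals[i] += a->regs[i];
        a->reg_reads++;
    }
}

/*
 * Models the periodic watchdog. With skip_stats set, the lock and the
 * statistics update are left out, which is the experiment suggested above.
 * Returns the number of register reads done in this call.
 */
static unsigned long mock_watchdog_task(struct mock_adapter *a, bool skip_stats)
{
    unsigned long before = a->reg_reads;

    if (!skip_stats) {
        /* Busy-wait until the flag is free, like a spinlock would. */
        while (atomic_flag_test_and_set_explicit(&a->stats_lock,
                                                 memory_order_acquire))
            ;
        mock_update_stats(a);
        atomic_flag_clear_explicit(&a->stats_lock, memory_order_release);
    }
    return a->reg_reads - before;
}
```

After mock_adapter_init(&a), mock_watchdog_task(&a, true) does no reads
under the lock, while mock_watchdog_task(&a, false) does one read per mock
register. In the real driver the equivalent change would be a test-only hack,
since the watchdog would no longer refresh the statistics.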

Best regards,
Alexander
