From: Koehrer Mathias
> Sent: 13 October 2016 11:57
..
> > The time between my trace points 700 and 701 is about 30us, and the time between my
> > trace points 600 and 601 is even 37us!!
> > The code in between is
> >     tsyncrxctl = rd32(E1000_TSYNCRXCTL);
> > and
> >     lvmmc = rd32(E1000_LVMMC);
> > respectively.
> >
> > In both cases this is a single read from a register.
> > I have no idea why this single read could take that much time!
> > Is it possible that the igb hardware is in a state that delays the read access, and that
> > this is why the whole I/O system might be delayed?
>
> To have a proper comparison, I did the same with kernel 3.18.27-rt27.
> Here, too, I instrumented the igb driver to get traces for the rd32() calls.
> However, here everything is generally much faster!
> On the idle system the maximum I got for a read was about 6us; most of the time it was 1-2us.

1-2us is probably about right: PCIe is high throughput, high latency.
You should see the latencies we get talking to FPGAs!

> On the 4.8 kernel this is always much slower (see above).
> My question now is: is there any kernel config option, introduced in the meantime,
> that may lead to this effect and which is not set in my 4.8 config?

Have a look at the generated code for rd32(). Someone might have added a load of
synchronisation instructions to it. On x86 I don't think it needs any.

It is also possible for other PCIe accesses to slow things down (which might be
why you see 6us).

I presume you are doing these comparisons on the same hardware? Obscure bus
topologies could slow things down.

	David
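
For reference, here is roughly what rd32() boils down to in the igb driver. This is
a sketch reconstructed from memory of drivers/net/ethernet/intel/igb, not verbatim
upstream source (the real igb_rd32() also handles surprise removal of the device);
on x86, readl() compiles to a plain MOV from the ioremap()ed BAR, with no extra
fencing:

    #include <linux/types.h>
    #include <linux/io.h>

    struct e1000_hw {
            u8 __iomem *hw_addr;    /* BAR0 mapping from ioremap() */
            /* ... */
    };

    static inline u32 igb_rd32(struct e1000_hw *hw, u32 reg)
    {
            /*
             * A single uncached MMIO load: the CPU stalls until the
             * PCIe read completion comes back, which is where the
             * latency under discussion is spent.
             */
            return readl(hw->hw_addr + reg);
    }

    #define rd32(reg) igb_rd32(hw, reg)    /* 'hw' in scope at call sites */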
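
And a minimal sketch of the kind of instrumentation Mathias describes. The
rd32_timed() wrapper and its trace format are hypothetical, but ktime_get_ns()
and trace_printk() are the standard in-kernel tools for this; the output shows
up in /sys/kernel/debug/tracing/trace:

    #include <linux/kernel.h>
    #include <linux/ktime.h>

    /* Hypothetical wrapper: time one MMIO read and log it via ftrace */
    static inline u32 rd32_timed(struct e1000_hw *hw, u32 reg)
    {
            u64 t0 = ktime_get_ns();
            u32 val = readl(hw->hw_addr + reg);

            trace_printk("rd32(0x%05x) took %llu ns\n",
                         reg, ktime_get_ns() - t0);
            return val;
    }

Comparing this trace on 3.18.27-rt27 and 4.8 on the same box would show whether
the extra ~30us is really spent inside the single readl(), independent of any
surrounding driver code.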