* Kevin Shanahan <kmshanah@xxxxxxxxxxx> wrote: > On Sat, 2009-02-14 at 21:50 +0100, Rafael J. Wysocki wrote: > > This message has been generated automatically as a part of a report > > of regressions introduced between 2.6.27 and 2.6.28. > > > > The following bug entry is on the current list of known regressions > > introduced between 2.6.27 and 2.6.28. Please verify if it still should > > be listed and let me know (either way). > > Yes, this should still be listed. > > I just tested against 2.6.29-rc5 and the problem is as bad as ever > (perhaps worse?) > > --- hermes-old.wumi.org.au ping statistics --- > 900 packets transmitted, 448 received, +317 errors, 50% packet loss, time 899845ms > rtt min/avg/max/mdev = 0.131/420.015/10890.699/1297.022 ms, pipe 11 i looked at the trace you did earlier and which you uploaded to: http://disenchant.net/tmp/bug-12465/trace-1/ Here is one 3 seconds (!) latency: 0) qemu-sy-4237 | | kvm_vcpu_block() { 0) qemu-sy-4237 | | kvm_cpu_has_interrupt() { 0) qemu-sy-4237 | | kvm_apic_has_interrupt() { 0) qemu-sy-4237 | 0.291 us | } 0) qemu-sy-4237 | | kvm_apic_accept_pic_intr() { 0) qemu-sy-4237 | 0.291 us | } 0) qemu-sy-4237 | 1.476 us | } 0) qemu-sy-4237 | | kvm_cpu_has_pending_timer() { 0) qemu-sy-4237 | 0.325 us | } 0) qemu-sy-4237 | | kvm_arch_vcpu_runnable() { 0) qemu-sy-4237 | 0.288 us | } 0) qemu-sy-4237 | | kvm_arch_vcpu_put() { 0) qemu-sy-4237 | 0.415 us | } 0) qemu-sy-4237 | | schedule() { 0) qemu-sy-4237 | | wakeup_preempt_entity() { 0) qemu-sy-4237 | 0.300 us | } ------------------------------------------ 0) qemu-sy-4237 => ksoftir-4 ------------------------------------------ 0) ksoftir-4 | ! 3010470 us | } ------------------------------------------ 0) ksoftir-4 => qemu-sy-4355 ------------------------------------------ 0) qemu-sy-4355 | 1.575 us | } 0) qemu-sy-4355 | 6.520 us | } 0) qemu-sy-4355 | 7.121 us | } 0) qemu-sy-4355 | | __wake_up() { 0) qemu-sy-4355 | | __wake_up_common() { 0) qemu-sy-4355 | | autoremove_wake_function() { 0) qemu-sy-4355 | | default_wake_function() { qemu-sy-4237 has been scheduled away, and the system appeared to have done nothing in the meantime. That's not something that really looks like a scheduler regression - there is nothing the scheduler can do if KVM decides to block a task. It would be nice to enhance this single-CPU trace some more - to more surgically see what is going on. Firstly, absolute timestamps would be nice: echo funcgraph-abstime > trace_options echo funcgraph-proc > trace_options as it's a bit hard to see the global timescale of events. Secondly, not all events are included - in particular i dont really see the points when packets are passed. Would it be possible to add a tracing hypercall so that the guest kernel can inject trace events that can be seen on the native-side trace? Regarding ping latencies really just two things matter: the loopback network device's rx and tx path. We should trace the outgoing sequence number and the incoming sequence number of IP packets, and inject that to the host side. This way we can correlate the delays precisely. Ingo -- To unsubscribe from this list: send the line "unsubscribe kernel-testers" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html