On Wed, 2009-01-21 at 16:18 +0100, Ingo Molnar wrote: > * Avi Kivity <avi@xxxxxxxxxx> wrote: > > It means, a scheduling problem. Can you run the latency tracer (which > > only works with realtime priority), so we can tell if it is (a) kvm > > failing to wake up the vcpu properly or (b) the scheduler delaying the > > vcpu from running. > > Could we please get an ftrace capture of the incident? > > Firstly, it makes sense to simplify the tracing environment as much as > possible: for example single-CPU traces are much easier to interpret. > > Can you reproduce it with just one CPU online? I.e. if you offline all the > other cores via: > > echo 0 > /sys/devices/system/cpu/cpu1/online > > [etc.] > > and keep CPU#0 only, do the latencies still occur? > > If they do still occur, then please do the traces that way. > > [ If they do not occur then switch back on all CPUs - we'll sort out the > traces ;-) ] > > Then please build a function tracer kernel, by enabling: > > CONFIG_FUNCTION_TRACER=y > CONFIG_FUNCTION_GRAPH_TRACER=y > CONFIG_DYNAMIC_FTRACE=y > > Once you boot into such a kernel, you can switch on function tracing via: > > cd /debug/tracing/ > > echo 0 > tracing_enabled > echo function_graph > current_tracer > echo funcgraph-proc > trace_options > > It does not run yet, first find a suitable set of functions to trace. For > example this will be a pretty good starting point for scheduler+KVM > problems: > > echo '' > set_ftrace_filter # clear filter functions > echo '*sched*' >> set_ftrace_filter > echo '*wake*' >> set_ftrace_filter > echo '*kvm*' >> set_ftrace_filter > echo 1 > tracing_enabled # let the tracer go > > You can see your current selection of functions to trace via 'cat > set_ftrace_filter', and you can see all functions via 'cat > available_filter_functions'. > > You can also trace all functions via: > > echo '*' > set_ftrace_filter > > Tracer output can be captured from the 'trace' file. It should look like > this: > > 15) cc1-28106 | 0.263 us | page_evictable(); > 15) cc1-28106 | | lru_cache_add_lru() { > 15) cc1-28106 | 0.252 us | __lru_cache_add(); > 15) cc1-28106 | 0.738 us | } > 15) cc1-28106 | + 74.026 us | } > 15) cc1-28106 | | up_read() { > 15) cc1-28106 | 0.257 us | _spin_lock_irqsave(); > 15) cc1-28106 | 0.253 us | _spin_unlock_irqrestore(); > 15) cc1-28106 | 1.329 us | } > > To capture a continuous stream of all trace data you can do: > > cat trace_pipe > /tmp/trace.txt > > (this will also drain the trace ringbuffers.) > > Note that this can be quite expensive if there are a lot of functions that > are traced - so it makes sense to trim down the set of traced functions to > only the interesting ones. Which are the interesting ones can be > determined from looking at the traces. You should see your KVM threads > getting active every second as the ping happens. > > If you get lost events you can increase the trace buffer size via the > buffer_size_kb control - the default is around 1.4 MB. > > Let me know if any of these steps is causing problems or if interpreting > the traces is difficult. Just carrying out the steps was okay, but I don't really know what I'm looking at. I've uploaded the trace here (about 10 seconds worth, I think): http://disenchant.net/tmp/bug-12465/trace-1/ The guest being pinged is process 4353: kmshanah@flexo:~$ pstree -p 4353 qemu-system-x86(4353)─┬─{qemu-system-x86}(4354) ├─{qemu-system-x86}(4355) └─{qemu-system-x86}(4772) I guess the larger overhead/duration values are what we are looking for, e.g.: kmshanah@flexo:~$ bzgrep -E '[[:digit:]]{6,}' trace.txt.bz2 0) ksoftir-4 | ! 3010470 us | } 0) qemu-sy-4354 | ! 250406.2 us | } 0) qemu-sy-4354 | ! 250407.0 us | } 0) qemu-sy-4354 | ! 362946.3 us | } 0) qemu-sy-4354 | ! 362947.0 us | } 0) qemu-sy-4177 | ! 780480.3 us | } 0) qemu-sy-4354 | ! 117685.7 us | } 0) qemu-sy-4354 | ! 117686.5 us | } That ksoftirqd value is a bit strange (> 3 seconds, or is the formatting wrong?). I guess I still need some guidance to know what I'm looking at with this trace and/or what to do next. Cheers, Kevin. -- To unsubscribe from this list: send the line "unsubscribe kernel-testers" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html