On 02/09/2012 01:27 PM, Amit Shah wrote:
On (Tue) 07 Feb 2012 [19:05:42], Marcelo Tosatti wrote:
Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.
Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.
Fixes suspend-to-disk with kvmclock.
There are stalls after resume, see trace below.
./x86_64-softmmu/qemu-system-x86_64 -kernel ~/src/linux/arch/x86/boot/bzImage -append 'root=/dev/vda1 console=tty0 console=ttyS0 no_console_suspend' -drive file=/guests/f14-suspend.qcow2,if=none,id=dr0 -device virtio-blk-pci,drive=dr0 -net nic,model=virtio -net user -serial stdio -enable-kvm -m 512 -cpu host -smp 4
<The following is on resume:>
Disabling non-boot CPUs ...
CPU 1 is now offline
CPU 2 is now offline
CPU 3 is now offline
lockdep: fixing up alternatives.
kvm-clock: cpu 0, msr 0:1f1d19c1, primary cpu clock, resume
PM: Restoring platform NVS memory
Enabling non-boot CPUs ...
lockdep: fixing up alternatives.
Booting Node 0 Processor 1 APIC 0x1
Calibrating delay loop (skipped) already calibrated this CPU
INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=314192 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
<IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
[<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
[<ffffffff81031208>] update_process_times+0x3f/0x75
[<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
[<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
[<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
[<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
[<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
<EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
[<ffffffff8130840c>] native_cpu_up+0x108/0x127
[<ffffffff81308f2b>] _cpu_up+0x92/0xfc
[<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
[<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
[<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
[<ffffffff81059e6d>] hibernate+0x97/0x196
[<ffffffff81057eb0>] state_store+0x5c/0x106
[<ffffffff811af49b>] kobj_attr_store+0x17/0x19
[<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
[<ffffffff810eb5f6>] vfs_write+0xab/0xd2
[<ffffffff810ec955>] ? fget_light+0x3a/0xa1
[<ffffffff810eb7ad>] sys_write+0x4d/0x74
[<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=314193 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
<IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
[<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
[<ffffffff81031208>] update_process_times+0x3f/0x75
[<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
[<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
[<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
[<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
[<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
<EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
[<ffffffff8130840c>] native_cpu_up+0x108/0x127
[<ffffffff81308f2b>] _cpu_up+0x92/0xfc
[<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
[<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
[<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
[<ffffffff81059e6d>] hibernate+0x97/0x196
[<ffffffff81057eb0>] state_store+0x5c/0x106
[<ffffffff811af49b>] kobj_attr_store+0x17/0x19
[<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
[<ffffffff810eb5f6>] vfs_write+0xab/0xd2
[<ffffffff810ec955>] ? fget_light+0x3a/0xa1
[<ffffffff810eb7ad>] sys_write+0x4d/0x74
[<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
INFO: rcu_sched detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840232 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
<IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
[<ffffffff8108bef5>] rcu_check_callbacks+0xab/0x108
[<ffffffff81031208>] update_process_times+0x3f/0x75
[<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
[<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
[<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
[<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
[<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
<EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
[<ffffffff8130840c>] native_cpu_up+0x108/0x127
[<ffffffff81308f2b>] _cpu_up+0x92/0xfc
[<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
[<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
[<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
[<ffffffff81059e6d>] hibernate+0x97/0x196
[<ffffffff81057eb0>] state_store+0x5c/0x106
[<ffffffff811af49b>] kobj_attr_store+0x17/0x19
[<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
[<ffffffff810eb5f6>] vfs_write+0xab/0xd2
[<ffffffff810ec955>] ? fget_light+0x3a/0xa1
[<ffffffff810eb7ad>] sys_write+0x4d/0x74
[<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
INFO: rcu_preempt detected stalls on CPUs/tasks: { 1} (detected by 0, t=1713579840233 jiffies)
Pid: 662, comm: bash Not tainted 3.3.0-rc2+ #293
Call Trace:
<IRQ> [<ffffffff8108b6c9>] __rcu_pending+0x268/0x3c5
[<ffffffff8108bf33>] rcu_check_callbacks+0xe9/0x108
[<ffffffff81031208>] update_process_times+0x3f/0x75
[<ffffffff810649b5>] tick_sched_timer+0x6d/0x8c
[<ffffffff81042e68>] __run_hrtimer+0xc9/0x15c
[<ffffffff81064948>] ? tick_nohz_handler+0xd5/0xd5
[<ffffffff8104359b>] hrtimer_interrupt+0xea/0x1b4
[<ffffffff81316192>] smp_apic_timer_interrupt+0x76/0x89
[<ffffffff81315833>] apic_timer_interrupt+0x73/0x80
<EOI> [<ffffffff8130b9ec>] ? arch_local_irq_restore+0x6/0xd
[<ffffffff8130840c>] native_cpu_up+0x108/0x127
[<ffffffff81308f2b>] _cpu_up+0x92/0xfc
[<ffffffff812facf7>] enable_nonboot_cpus+0x4d/0xb2
[<ffffffff81059836>] hibernation_snapshot+0x1f3/0x2a0
[<ffffffff81044406>] ? cleanup_srcu_struct+0x52/0x58
[<ffffffff81059e6d>] hibernate+0x97/0x196
[<ffffffff81057eb0>] state_store+0x5c/0x106
[<ffffffff811af49b>] kobj_attr_store+0x17/0x19
[<ffffffff811368fa>] sysfs_write_file+0x10e/0x14a
[<ffffffff810eb5f6>] vfs_write+0xab/0xd2
[<ffffffff810ec955>] ? fget_light+0x3a/0xa1
[<ffffffff810eb7ad>] sys_write+0x4d/0x74
[<ffffffff81314ce9>] system_call_fastpath+0x16/0x1b
Amit
Stalls are probably caused by uninitialized percpu hv_clock, with
following patch I don't see stalls. Although I might be just lucky.
http://git.kernel.org/?p=virt/kvm/kvm.git;a=commit;h=e2971ac7e1d186af059e088d305496c5cb47d487
However there is/are a warning/s on suspend path and with following patch:
diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 66c74f4..c8e7587 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -117,6 +117,7 @@
static void native_smp_send_reschedule(int cpu)
{
if (unlikely(cpu_is_offline(cpu))) {
+ printk ("curr_cpu: %d, for cpu: %d\n", smp_processor_id(), cpu);
WARN_ON(1);
return;
}
[ 3.101919] Unregister pv shared memory for cpu 1
[ 3.207321] CPU 1 is now offline
[ 3.210177] curr_cpu: 0, for cpu: 1
[ 3.211172] ------------[ cut here ]------------
[ 3.211172] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[ 3.211172] Hardware name: Bochs
[ 3.211172] Modules linked in: floppy virtio_blk
[ 3.211172] Pid: 264, comm: sh Not tainted 3.3.0-rc1+ #42
[ 3.211172] Call Trace:
[ 3.211172] <IRQ> [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[ 3.211172] [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[ 3.211172] [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[ 3.211172] [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[ 3.211172] [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[ 3.211172] [<ffffffff81052db8>] update_process_times+0x65/0x76
[ 3.211172] [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[ 3.211172] [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[ 3.211172] [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[ 3.211172] [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[ 3.211172] [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[ 3.211172] [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[ 3.211172] <EOI> [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[ 3.211172] [<ffffffff8111ebe3>] ? cmpxchg_double_slab+0x6/0xef
[ 3.211172] [<ffffffff814c9afc>] __slab_free+0xe6/0x1b7
[ 3.211172] [<ffffffff8111eed2>] kmem_cache_free+0x72/0xac
[ 3.211172] [<ffffffff814ba14f>] ? cpuid4_cache_sysfs_exit+0x94/0xad
[ 3.211172] [<ffffffff81186d23>] release_sysfs_dirent+0x9b/0xb9
[ 3.211172] [<ffffffff81186d59>] __sysfs_put+0x18/0x1a
[ 3.211172] [<ffffffff81187257>] sysfs_addrm_finish+0x8a/0xa2
[ 3.211172] [<ffffffff81185d2b>] sysfs_hash_and_remove+0x5c/0x71
[ 3.211172] [<ffffffff8118674d>] sysfs_remove_file+0x34/0x3b
[ 3.211172] [<ffffffff812ee9da>] device_remove_file+0x17/0x19
[ 3.211172] [<ffffffff814bcaff>] mce_cpu_callback+0xa3/0x186
[ 3.211172] [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[ 3.211172] [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[ 3.211172] [<ffffffff81047758>] __cpu_notify+0x20/0x32
[ 3.211172] [<ffffffff81047863>] cpu_notify_nofail+0x13/0x1b
[ 3.211172] [<ffffffff814ae6b3>] _cpu_down+0x13a/0x21f
[ 3.211172] [<ffffffff81047915>] disable_nonboot_cpus+0x66/0x11e
[ 3.211172] [<ffffffff8107bff1>] hibernation_restore+0x78/0x106
[ 3.211172] [<ffffffff8107c25f>] software_resume+0x1e0/0x249
[ 3.211172] [<ffffffff8107c353>] resume_store+0x8b/0x97
[ 3.211172] [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[ 3.211172] [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[ 3.211172] [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[ 3.211172] [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[ 3.211172] [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[ 3.211172] [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[ 3.211172] ---[ end trace 5aa7abdf47c70330 ]---
[ 3.276748] Unregister pv shared memory for cpu 2
[ 3.280771] CPU 2 is now offline
[ 3.285070] Unregister pv shared memory for cpu 3
[ 3.288206] CPU 3 is now offline
[ 31.290110] kvm-clock: cpu 0, msr 0:1fc13341, primary cpu clock, resume
[ 31.290110] Enabling non-boot CPUs ...
[ 31.291319] Booting Node 0 Processor 1 APIC 0x1
[ 31.293720] smpboot cpu 1: start_ip = 96000
[ 31.172141] Calibrating delay loop (skipped) already calibrated this CPUcurr_cpu: 0, for cpu: 2
[ 31.330037] ------------[ cut here ]------------
[ 31.330094] WARNING: at arch/x86/kernel/smp.c:121 native_smp_send_reschedule+0x40/0x71()
[ 31.330094] Hardware name: Bochs
[ 31.330094] Modules linked in: lockd fcoe libfcoe 8021q libfc scsi_transport_fc scsi_tgt garp stp llc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables e1000 pcspkr uinput i2c_piix4 i2c_core ppdev parport_pc parport
microcode sunrpc floppy virtio_blk [last unloaded: scsi_wait_scan]
[ 31.330094] Pid: 1016, comm: bash Tainted: G W 3.3.0-rc1+ #42
[ 31.330094] Call Trace:
[ 31.330094] <IRQ> [<ffffffff81045ecd>] warn_slowpath_common+0x83/0x9b
[ 31.330094] [<ffffffff81045eff>] warn_slowpath_null+0x1a/0x1c
[ 31.330094] [<ffffffff81026dba>] native_smp_send_reschedule+0x40/0x71
[ 31.330094] [<ffffffff810749d6>] trigger_load_balance+0x28e/0x2ba
[ 31.330094] [<ffffffff8106e7ec>] scheduler_tick+0xe9/0xf2
[ 31.330094] [<ffffffff81052db8>] update_process_times+0x65/0x76
[ 31.330094] [<ffffffff81088219>] tick_sched_timer+0x75/0x9f
[ 31.330094] [<ffffffff81064056>] __run_hrtimer+0xb1/0x155
[ 31.330094] [<ffffffff810881a4>] ? tick_nohz_handler+0xdd/0xdd
[ 31.330094] [<ffffffff8106476c>] hrtimer_interrupt+0xe0/0x19e
[ 31.330094] [<ffffffff814d847c>] smp_apic_timer_interrupt+0x77/0x8a
[ 31.330094] [<ffffffff814d7133>] apic_timer_interrupt+0x73/0x80
[ 31.330094] <EOI> [<ffffffff814cffe1>] ? retint_restore_args+0x13/0x13
[ 31.330094] [<ffffffff8108d42a>] ? arch_local_irq_restore+0x6/0xd
[ 31.330094] [<ffffffff814cfd88>] _raw_spin_unlock_irqrestore+0x3d/0x3f
[ 31.330094] [<ffffffff814b1bd3>] workqueue_cpu_callback+0x203/0x214
[ 31.330094] [<ffffffff814d31c0>] notifier_call_chain+0x37/0x63
[ 31.330094] [<ffffffff814c6b68>] ? arch_local_irq_restore+0x6/0xd
[ 31.330094] [<ffffffff810655b6>] __raw_notifier_call_chain+0xe/0x10
[ 31.330094] [<ffffffff81047758>] __cpu_notify+0x20/0x32
[ 31.330094] [<ffffffff814bfb99>] _cpu_up+0xcd/0xfc
[ 31.330094] [<ffffffff814ae88b>] enable_nonboot_cpus+0x57/0xc4
[ 31.330094] [<ffffffff8107becc>] hibernation_snapshot+0x1f3/0x2a0
[ 31.330094] [<ffffffff8107c503>] hibernate+0x97/0x198
[ 31.330094] [<ffffffff8107a613>] state_store+0x5c/0x106
[ 31.330094] [<ffffffff8122f23b>] kobj_attr_store+0x17/0x19
[ 31.330094] [<ffffffff81186183>] sysfs_write_file+0x101/0x13d
[ 31.330094] [<ffffffff8112f2e3>] vfs_write+0xac/0xf3
[ 31.330094] [<ffffffff810caf1d>] ? trace_hardirqs_on_caller+0xf/0x24
[ 31.330094] [<ffffffff8112f4d2>] sys_write+0x4a/0x6e
[ 31.330094] [<ffffffff814d6629>] system_call_fastpath+0x16/0x1b
[ 31.330094] ---[ end trace bbd3f0319d3e2c44 ]---
[ 31.402829] KVM setup async PF for cpu 1
[ 31.402829] kvm-stealtime: cpu 1, msr 1fc8df00
[ 31.656224] NMI watchdog disabled (cpu1): hardware events not enabled
[ 31.659240] CPU1 is up
[ 31.660402] Booting Node 0 Processor 2 APIC 0x2
[ 31.661522] smpboot cpu 2: start_ip = 96000
[ 31.278595] Calibrating delay loop (skipped) already calibrated this CPU
[ 31.697981] NMI watchdog disabled (cpu2): hardware events not enabled
[ 31.698047] KVM setup async PF for cpu 2
[ 31.698051] kvm-stealtime: cpu 2, msr 1fd0df00
[ 31.706326] CPU2 is up
however I think they are not related to subj patch.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html