Hi
I'm getting a hard lockup in wait_lapic_expire(). I'm not certain why,
and it didn't seem to reproduce on a second setup. However I found a
suspect in the code for TSC deadline timers. Maybe someone is interested
in my analysis.
If a guest has a TSC deadline timer set, it's not re-done when the TSC
is adjusted, and will fire at the wrong time. The hrtimer used for
emulation is not being recalculated. If the TSC was set _backwards_, I
think it could trigger a lockup in wait_lapic_expire(). This function is
a busy-loop optimization, which could be tricked into busy-looping for
too long. The expected busy-wait is `lapic_timer_advance_ns`, which
defaults to 0 (I haven't changed it).
I'm not confident about submitting a patch to redo the timer for full
correctness. There's too many different places where the TSC is adjusted
:(. I could add a simple safety check to fix my issue, but I feel like
there should be a fix for whatever the root cause is.
My lockup issue seems to happen with about 50% probability when I'm
using virt-manager to restore a snapshotted VM. This is on a NUC 5CPYH
("Braswell" cpu), with kernel 4.4.9-300.fc23.x86_64, guest is Debian 8.
The second setup I tried (and failed to reproduce it on) was also Fedora
23, with a Debian 8 guest, but on a Thinkpad X201. (CPU: an old Core i5).
Fedora tracker: https://bugzilla.redhat.com/show_bug.cgi?id=1337667
Backtrace:
[ 534.539016] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
[ 534.539025] Modules linked in:
[ 534.539030] bnep bluetooth cpufreq_stats fuse nf_conntrack_netbios_ns
nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack
ip_set nfnetlink ebtable_nat ebtable_filter ebtable_broute bridge ebtables
ip6table_security ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
ip6table_mangle ip6table_raw ip6table_filter ip6_tables iptable_security
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_mangle iptable_raw vfat fat btrfs xor raid6_pq intel_rapl coretemp
kvm_intel arc4 kvm irqbypass crct10dif_pclmul iwlmvm crc32_pclmul mac80211
snd_hda_codec_hdmi crc32c_intel iTCO_wdt iTCO_vendor_support
snd_hda_codec_realtek snd_intel_sst_acpi snd_intel_sst_core snd_soc_rt5670
snd_soc_sst_mfld_platform iwlwifi nxp_nci_i2c snd_hda_codec_generic nxp_nci
[ 534.539086] snd_soc_rl6231 snd_hda_intel snd_soc_core snd_hda_codec nci
cfg80211 i2c_i801 mei_txe mei lpc_ich ir_mce_kbd_decoder ir_sanyo_decoder
ir_sharp_decoder ir_xmp_decoder ir_rc6_decoder ir_rc5_decoder
ir_sony_decoder
ir_lirc_codec lirc_dev ir_nec_decoder ir_jvc_decoder snd_hda_core snd_hwdep
shpchp snd_compress rc_rc6_mce snd_pcm_dmaengine snd_seq nfc ite_cir
ac97_bus
snd_seq_device rc_core snd_pcm i2c_designware_platform snd_timer
i2c_designware_core snd tpm_tis iosf_mbi soundcore rfkill_gpio rfkill tpm
spi_pxa2xx_platform soc_button_array nfsd auth_rpcgss nfs_acl lockd grace
sunrpc i915 i2c_algo_bit drm_kms_helper 8021q garp stp llc mrp drm r8169
serio_raw mii sdhci_acpi sdhci mmc_core video i2c_hid fjes
[ 534.539151] CPU: 1 PID: 3071 Comm: qemu-system-x86 Not tainted
4.4.9-300.fc23.x86_64 #1
[ 534.539154] Hardware name: /NUC5CPYB, BIOS
PYBSWCEL.86A.0053.2016.0401.1154 04/01/2016
[ 534.539157] 0000000000000086 00000000b08a09bb ffff88017fd05b90
ffffffff813b62ee
[ 534.539161] 0000000000000000 0000000000000000 ffff88017fd05ba8
ffffffff8115aebd
[ 534.539165] ffff88017a588000 ffff88017fd05be0 ffffffff811a32ec
0000000000000001
[ 534.539170] Call Trace:
[ 534.539173] <NMI> [<ffffffff813b62ee>] dump_stack+0x63/0x85
[ 534.539189] [<ffffffff8115aebd>] watchdog_overflow_callback+0xbd/0xd0
[ 534.539194] [<ffffffff811a32ec>] __perf_event_overflow+0x8c/0x1d0
[ 534.539198] [<ffffffff811a3ec4>] perf_event_overflow+0x14/0x20
[ 534.539204] [<ffffffff81035fe1>] intel_pmu_handle_irq+0x1e1/0x4a0
[ 534.539210] [<ffffffff8102c758>] perf_event_nmi_handler+0x28/0x50
[ 534.539215] [<ffffffff8101b0c9>] nmi_handle+0x69/0x120
[ 534.539218] [<ffffffff8101b63d>] default_do_nmi+0xad/0xf0
[ 534.539221] [<ffffffff8101b762>] do_nmi+0xe2/0x130
[ 534.539227] [<ffffffff817a3471>] end_repeat_nmi+0x1a/0x1e
[ 534.539238] [<ffffffffa06fa3d0>] ? vmx_invpcid_supported+0x30/0x30
[kvm_intel]
[ 534.539243] [<ffffffff813c32a5>] ? delay_tsc+0x25/0x60
[ 534.539246] [<ffffffff813c32a5>] ? delay_tsc+0x25/0x60
[ 534.539250] [<ffffffff813c32a5>] ? delay_tsc+0x25/0x60
[ 534.539252] <<EOE>> [<ffffffff813c31df>] __delay+0xf/0x20
[ 534.539287] [<ffffffffa07b48db>] wait_lapic_expire+0xcb/0x150 [kvm]
[ 534.539311] [<ffffffffa0797891>]
kvm_arch_vcpu_ioctl_run+0x371/0x14e0 [kvm]
[ 534.539316] [<ffffffff81120dd0>] ? futex_wake+0x90/0x180
[ 534.539338] [<ffffffffa078005d>] kvm_vcpu_ioctl+0x33d/0x620 [kvm]
[ 534.539344] [<ffffffff810b3951>] ? do_sigtimedwait+0xd1/0x200
[ 534.539350] [<ffffffff81241648>] do_vfs_ioctl+0x298/0x480
[ 534.539355] [<ffffffff81338393>] ? security_file_ioctl+0x43/0x60
[ 534.539359] [<ffffffff812418a9>] SyS_ioctl+0x79/0x90
[ 534.539364] [<ffffffff817a0f2e>] entry_SYSCALL_64_fastpath+0x12/0x71
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html