Re: [PATCH v2 0/2] target/i386/kvm: fix two svm pmu virtualization bugs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ping?

About [PATCH v2 2/2], the bad thing is that the customer will not be able to
notice the issue, that is, the "Broken BIOS detected" in dmesg, immediately.

As a result, the customer VM many panic randomly anytime in the future (once
issue is encountered) if "/proc/sys/kernel/unknown_nmi_panic" is enabled.

Thank you very much!

Dongli Zhang

On 12/19/22 06:45, Dongli Zhang wrote:
> Can I get feedback for this patchset, especially the [PATCH v2 2/2]?
> 
> About the [PATCH v2 2/2], currently the issue impacts the usage of PMUs on AMD
> VM, especially the below case:
> 
> 1. Enable panic on nmi.
> 2. Use perf to monitor the performance of VM. Although without a test, I think
> the nmi watchdog has the same effect.
> 3. A sudden system reset, or a kernel panic (kdump/kexec).
> 4. After reboot, there will be random unknown NMI.
> 5. Unfortunately, the "panic on nmi" may panic the VM randomly at any time.
> 
> Thank you very much!
> 
> Dongli Zhang
> 
> On 12/1/22 16:22, Dongli Zhang wrote:
>> This patchset is to fix two svm pmu virtualization bugs, x86 only.
>>
>> version 1:
>> https://lore.kernel.org/all/20221119122901.2469-1-dongli.zhang@xxxxxxxxxx/
>>
>> 1. The 1st bug is that "-cpu,-pmu" cannot disable svm pmu virtualization.
>>
>> To use "-cpu EPYC" or "-cpu host,-pmu" cannot disable the pmu
>> virtualization. There is still below at the VM linux side ...
>>
>> [    0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
>>
>> ... although we expect something like below.
>>
>> [    0.596381] Performance Events: PMU not available due to virtualization, using software events only.
>> [    0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
>>
>> The 1st patch has introduced a new x86 only accel/kvm property
>> "pmu-cap-disabled=true" to disable the pmu virtualization via
>> KVM_PMU_CAP_DISABLE.
>>
>> I considered 'KVM_X86_SET_MSR_FILTER' initially before patchset v1.
>> Since both KVM_X86_SET_MSR_FILTER and KVM_PMU_CAP_DISABLE are VM ioctl. I
>> finally used the latter because it is easier to use.
>>
>>
>> 2. The 2nd bug is that un-reclaimed perf events (after QEMU system_reset)
>> at the KVM side may inject random unwanted/unknown NMIs to the VM.
>>
>> The svm pmu registers are not reset during QEMU system_reset.
>>
>> (1). The VM resets (e.g., via QEMU system_reset or VM kdump/kexec) while it
>> is running "perf top". The pmu registers are not disabled gracefully.
>>
>> (2). Although the x86_cpu_reset() resets many registers to zero, the
>> kvm_put_msrs() does not puts AMD pmu registers to KVM side. As a result,
>> some pmu events are still enabled at the KVM side.
>>
>> (3). The KVM pmc_speculative_in_use() always returns true so that the events
>> will not be reclaimed. The kvm_pmc->perf_event is still active.
>>
>> (4). After the reboot, the VM kernel reports below error:
>>
>> [    0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
>> [    0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)
>>
>> (5). In a worse case, the active kvm_pmc->perf_event is still able to
>> inject unknown NMIs randomly to the VM kernel.
>>
>> [...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>>
>> The 2nd patch is to fix the issue by resetting AMD pmu registers as well as
>> Intel registers.
>>
>>
>> This patchset does not cover PerfMonV2, until the below patchset is merged
>> into the KVM side.
>>
>> [PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support
>> https://lore.kernel.org/all/20221111102645.82001-1-likexu@xxxxxxxxxxx/
>>
>>
>> Dongli Zhang (2):
>>       target/i386/kvm: introduce 'pmu-cap-disabled' to set KVM_PMU_CAP_DISABLE
>>       target/i386/kvm: get and put AMD pmu registers
>>
>>  accel/kvm/kvm-all.c      |   1 +
>>  include/sysemu/kvm_int.h |   1 +
>>  qemu-options.hx          |   7 +++
>>  target/i386/cpu.h        |   5 ++
>>  target/i386/kvm/kvm.c    | 129 +++++++++++++++++++++++++++++++++++++++++-
>>  5 files changed, 141 insertions(+), 2 deletions(-)
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>>



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux