Re: [PATCH 1/1 V4] qemu-kvm: fix improper nmi emulation

Jan Kiszka <jan.kiszka@xxxxxx> · Fri, 14 Oct 2011 10:31:18 +0200

On 2011-10-14 09:43, Lai Jiangshan wrote:
> On 10/14/2011 02:49 PM, Jan Kiszka wrote:
>> On 2011-10-14 08:36, Lai Jiangshan wrote:
>>> On 10/14/2011 01:53 PM, Jan Kiszka wrote:
>>>> On 2011-10-14 02:53, Lai Jiangshan wrote:
>>>>>
>>>>>>
>>>>>> As explained in some other mail, we could then emulate the missing
>>>>>> kernel feature by reading out the current in-kernel APIC state, testing
>>>>>> if LINT1 is unmasked, and then delivering the NMI directly.
>>>>>>
>>>>>
>>>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states,
>>>>> so this approach will cause some troubles.
>>>>
>>>> run_on_cpu() can help.
>>>>
>>>> Jan
>>>>
>>>
>>> Ah, I forgot it, Thanks.
>>>
>>> From: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
>>>
>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>> button event happens. This doesn't properly emulate real hardware on
>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>> the processor even when LINT1 is maskied in LVT. For example, this
>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, inject-nmi request is handled as follows.
>>>
>>> - When in-kernel irqchip is disabled, deliver LINT1 instead of NMI
>>>   interrupt.
>>> - When in-kernel irqchip is enabled, get the in-kernel LAPIC states
>>>   and test the APIC_LVT_MASKED, if LINT1 is unmasked, and then
>>>   delivering the NMI directly. (Suggested by Jan Kiszka)
>>>
>>> Changed from old version:
>>>   re-implement it by the Jan's suggestion.
>>>
>>> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
>>> Reported-by: Kenji Kaneshige <kaneshige.kenji@xxxxxxxxxxxxxx>
>>> ---
>>>  hw/apic.c |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>  hw/apic.h |    1 +
>>>  monitor.c |    6 +++++-
>>>  3 files changed, 54 insertions(+), 1 deletions(-)
>>> diff --git a/hw/apic.c b/hw/apic.c
>>> index 69d6ac5..9a40129 100644
>>> --- a/hw/apic.c
>>> +++ b/hw/apic.c
>>> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>>>      }
>>>  }
>>>  
>>> +#ifdef KVM_CAP_IRQCHIP
>>
>> Again, this is always defined on x86 thus pointless to test.
>>
>>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
>>> +
>>> +struct kvm_get_remote_lapic_params {
>>> +    CPUState *env;
>>> +    struct kvm_lapic_state klapic;
>>> +};
>>> +
>>> +static void kvm_get_remote_lapic(void *p)
>>> +{
>>> +    struct kvm_get_remote_lapic_params *params = p;
>>> +
>>> +    kvm_get_lapic(params->env, &params->klapic);
>>
>> When you already interrupted that vcpu, why not inject from here? Avoids
>> one further ping-pong round.
> 
> get_remote_lapic and inject nmi are two different things,
> so I don't inject nmi from here. I didn't notice this ping-pond overhead.
> Thank you.

Actually, it is not performance-critical. But there is a race between
obtaining the APIC state and testing for the NMI injection path. So it's
better to define an on-vcpu LINT1 NMI injection service.

> 
>>
>>> +}
>>> +
>>> +void apic_deliver_nmi(DeviceState *d)
>>> +{
>>> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>>> +
>>> +    if (kvm_irqchip_in_kernel()) {
>>> +        struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,};
>>> +        uint32_t lvt;
>>> +
>>> +        run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p);
>>> +        lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1);
>>> +
>>> +        if (lvt & APIC_LVT_MASKED) {
>>> +            return;
>>> +        }
>>> +
>>> +        if (((lvt >> 8) & 7) != APIC_DM_NMI) {
>>> +            return;
>>> +        }
>>> +
>>> +        cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI);
>>
>> Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if
>> this workaround is needed? Oh, your latest kernel patch is missing this
>> again - requires fixing as well.
>>
> 
> 
> Kernel site patch is dropped with this v4 patch.
> 
> Did you mean you want KVM_CAP_SET_LINT1 + KVM_SET_LINT1 patches?
> I have made them.

OK, so this is going to be applied on top? Then I take this remark back.

Jan

Attachment:
signature.asc

Description: OpenPGP digital signature