Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy

James Morse <james.morse@xxxxxxx> · Thu, 23 Apr 2020 15:34:53 +0100

Hi guys,

On 23/04/2020 13:03, Marc Zyngier wrote:
> On 2020-04-23 12:35, James Morse wrote:
>> On 22/04/2020 17:18, Marc Zyngier wrote:
>>> From: Zenghui Yu <yuzenghui@xxxxxxxxxx>
>>>
>>> It's likely that the vcpu fails to handle all virtual interrupts if
>>> userspace decides to destroy it, leaving the pending ones in the
>>> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
>>> eventually be leaked because of an extra refcount increment in
>>> vgic_queue_irq_unlock().
>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>>> index a963b9d766b73..53ec9b9d9bc43 100644
>>> --- a/virt/kvm/arm/vgic/vgic-init.c
>>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>>  {
>>>      struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>
>>> +    /*
>>> +     * Retire all pending LPIs on this vcpu anyway as we're
>>> +     * going to destroy it.
>>> +     */
>>
>> Looking at the other caller, do we need something like:
>> |    if (vgic_cpu->lpis_enabled)
>>
>> ?
> 
> Huh... On its own, this call is absolutely harmless even if you
> don't have LPIs. But see below.
> 
>>
>>> +    vgic_flush_pending_lpis(vcpu);
>>> +
>>
>> Otherwise, I get this on a GICv2 machine:
>> [ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
>> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
>> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jul 30 2018
>> [ 1742.222596] Call trace:
>> [ 1742.225059]  dump_backtrace+0x0/0x328
>> [ 1742.228738]  show_stack+0x18/0x28
>> [ 1742.232071]  dump_stack+0x134/0x1b0
>> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
>> [ 1742.240910]  __kasan_report+0x10c/0x180
>> [ 1742.244763]  kasan_report+0x4c/0x68
>> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
>> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
>> [ 1742.261919]  kvm_vgic_destroy+0x30/0x48
>> [ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
>> [ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
>> [ 1742.273567]  kvm_vm_release+0x3c/0x60
>> [ 1742.277248]  __fput+0x218/0x630
>> [ 1742.280406]  ____fput+0x10/0x20
>> [ 1742.283565]  task_work_run+0xd8/0x1f0
>> [ 1742.287245]  do_exit+0x87c/0x2640
>> [ 1742.290575]  do_group_exit+0xd0/0x258
>> [ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
>> [ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
>> [ 1742.303529]  do_el0_svc+0x48/0xd0
>> [ 1742.306861]  el0_sync_handler+0x11c/0x1b8
>> [ 1742.310888]  el0_sync+0x158/0x180

>> [ 1742.348215] page dumped because: kasan: bad access detected

> I think this is slightly more concerning. The issue is that we have
> started freeing parts of the interrupt state already (we free the
> SPIs early in kvm_vgic_dist_destroy()).

(I took this to be a wild pointer access. In the use-after-free reports I've seen before,
KASAN also printed where the object was allocated and where it was freed.)

> If a SPI was pending or active at this stage (i.e. present in the
> ap_list), we are going to iterate over memory that has been freed
> already. This is bad, and this can happen on GICv3 as well.

> I think this should solve it, but I need to test it on a GICv2 system:
> 
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 53ec9b9d9bc43..30dbec9fe0b4a 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
> 
>      vgic_debug_destroy(kvm);
> 
> -    kvm_vgic_dist_destroy(kvm);
> -
>      kvm_for_each_vcpu(i, vcpu, kvm)
>          kvm_vgic_vcpu_destroy(vcpu);
> +
> +    kvm_vgic_dist_destroy(kvm);
>  }
> 
>  void kvm_vgic_destroy(struct kvm *kvm)

This works for me on Juno.
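
For anyone following along, here is a minimal userspace sketch of why the
ordering matters. It is a toy model, not the kernel code: every toy_* name is
made up for illustration, and a "live" flag stands in for the actual free so
the stale access can be counted instead of faulting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SPIS 4

/* Hypothetical stand-in for struct vgic_irq: only what the sketch needs. */
struct toy_irq {
	bool live;             /* false once the distributor has "freed" it */
	struct toy_irq *next;  /* toy ap_list linkage */
};

/* Hypothetical stand-in for the per-vcpu vgic state. */
struct toy_vcpu {
	struct toy_irq *ap_list;  /* pending/active interrupts on this vcpu */
};

struct toy_vm {
	struct toy_irq spis[TOY_NR_SPIS];  /* distributor-owned SPI state */
	struct toy_vcpu vcpu;
};

/* Set up a VM with one SPI still sitting in the vcpu's ap_list. */
void toy_vm_init(struct toy_vm *vm)
{
	for (size_t i = 0; i < TOY_NR_SPIS; i++) {
		vm->spis[i].live = true;
		vm->spis[i].next = NULL;
	}
	vm->vcpu.ap_list = &vm->spis[1];
}

/* Models the distributor teardown: the SPI state goes away here. */
void toy_dist_destroy(struct toy_vm *vm)
{
	for (size_t i = 0; i < TOY_NR_SPIS; i++)
		vm->spis[i].live = false;
}

/*
 * Models the per-vcpu teardown: walk the ap_list and return how many
 * entries had already been torn down when we looked at them.
 */
int toy_vcpu_destroy(struct toy_vcpu *vcpu)
{
	int stale = 0;

	for (struct toy_irq *irq = vcpu->ap_list; irq; irq = irq->next) {
		if (!irq->live)
			stale++;
	}
	vcpu->ap_list = NULL;
	return stale;
}

/* Old order: distributor first, then vcpus. Returns stale accesses. */
int toy_teardown_dist_first(void)
{
	struct toy_vm vm;
	int stale;

	toy_vm_init(&vm);
	toy_dist_destroy(&vm);
	stale = toy_vcpu_destroy(&vm.vcpu);
	return stale;
}

/* New order: vcpus first, then distributor. Returns stale accesses. */
int toy_teardown_vcpus_first(void)
{
	struct toy_vm vm;
	int stale;

	toy_vm_init(&vm);
	stale = toy_vcpu_destroy(&vm.vcpu);
	toy_dist_destroy(&vm);
	return stale;
}
```

With the distributor torn down first, the ap_list walk touches one entry
that is already gone; with the vcpus torn down first it touches none, which
is the ordering the diff above switches to.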

Thanks,

James