Eduardo Habkost <ehabkost@xxxxxxxxxx> writes: > On Mon, Aug 23, 2021 at 04:30:28PM +0200, Vitaly Kuznetsov wrote: >> KASAN reports the following issue: >> >> BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm] >> Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798 >> >> CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G X --------- --- >> Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020 >> Call Trace: >> dump_stack+0xa5/0xe6 >> print_address_description.constprop.0+0x18/0x130 >> ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] >> __kasan_report.cold+0x7f/0x114 >> ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] >> kasan_report+0x38/0x50 >> kasan_check_range+0xf5/0x1d0 >> kvm_make_vcpus_request_mask+0x174/0x440 [kvm] >> kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm] >> ? kvm_arch_exit+0x110/0x110 [kvm] >> ? sched_clock+0x5/0x10 >> ioapic_write_indirect+0x59f/0x9e0 [kvm] >> ? static_obj+0xc0/0xc0 >> ? __lock_acquired+0x1d2/0x8c0 >> ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm] >> >> The problem appears to be that 'vcpu_bitmap' is allocated as a single long >> on stack and it should really be KVM_MAX_VCPUS long. We also seem to clear >> the lower 16 bits of it with bitmap_zero() for no particular reason (my >> guess would be that 'bitmap' and 'vcpu_bitmap' variables in >> kvm_bitmap_or_dest_vcpus() caused the confusion: while the later is indeed >> 16-bit long, the later should accommodate all possible vCPUs). >> >> Fixes: 7ee30bc132c6 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs") >> Fixes: 9a2ae9f6b6bb ("KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap") >> Reported-by: Dr. David Alan Gilbert <dgilbert@xxxxxxxxxx> >> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> >> --- >> arch/x86/kvm/ioapic.c | 10 +++++----- >> 1 file changed, 5 insertions(+), 5 deletions(-) >> >> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c >> index ff005fe738a4..92cd4b02e9ba 100644 >> --- a/arch/x86/kvm/ioapic.c >> +++ b/arch/x86/kvm/ioapic.c >> @@ -319,7 +319,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) >> unsigned index; >> bool mask_before, mask_after; >> union kvm_ioapic_redirect_entry *e; >> - unsigned long vcpu_bitmap; >> + unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]; > > Is there a way to avoid this KVM_MAX_VCPUS-sized variable on the > stack? This might hit us back when we increase KVM_MAX_VCPUS to > a few thousand VCPUs (I was planning to submit a patch for that > soon). What's the short- or mid-term target? Note, we're allocating KVM_MAX_VCPUS bits (not bytes!) here, this means that for e.g. 2048 vCPUs we need 256 bytes of the stack only. In case the target much higher than that, we will need to either switch to dynamic allocation or e.g. use pre-allocated per-CPU variables and make this a preempt-disabled region. I, however, would like to understand if the problem with allocating this from stack is real or not first. > > >> int old_remote_irr, old_delivery_status, old_dest_id, old_dest_mode; >> >> switch (ioapic->ioregsel) { >> @@ -384,9 +384,9 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) >> irq.shorthand = APIC_DEST_NOSHORT; >> irq.dest_id = e->fields.dest_id; >> irq.msi_redir_hint = false; >> - bitmap_zero(&vcpu_bitmap, 16); >> + bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS); >> kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq, >> - &vcpu_bitmap); >> + vcpu_bitmap); >> if (old_dest_mode != e->fields.dest_mode || >> old_dest_id != e->fields.dest_id) { >> /* >> @@ -399,10 +399,10 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) >> kvm_lapic_irq_dest_mode( >> !!e->fields.dest_mode); >> kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq, >> - &vcpu_bitmap); >> + vcpu_bitmap); >> } >> kvm_make_scan_ioapic_request_mask(ioapic->kvm, >> - &vcpu_bitmap); >> + vcpu_bitmap); >> } else { >> kvm_make_scan_ioapic_request(ioapic->kvm); >> } >> -- >> 2.31.1 >> -- Vitaly