On Thu, Dec 09, 2021, Maxim Levitsky wrote:
> Also got this while trying a VM with passed through device:
>
> [mlevitsk@amdlaptop ~]$[   34.926140] usb 5-3: reset full-speed USB device number 3 using xhci_hcd
> [   42.583661] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
> [  363.562173] VFIO - User Level meta-driver version: 0.3
> [  365.160357] vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x1e@0x154
> [  384.138110] BUG: kernel NULL pointer dereference, address: 0000000000000021
> [  384.154039] #PF: supervisor read access in kernel mode
> [  384.165645] #PF: error_code(0x0000) - not-present page
> [  384.177254] PGD 16da9d067 P4D 16da9d067 PUD 13ad1a067 PMD 0
> [  384.190036] Oops: 0000 [#1] SMP
> [  384.197117] CPU: 3 PID: 14403 Comm: CPU 3/KVM Tainted: G           O      5.16.0-rc4.unstable #6
> [  384.216978] Hardware name: LENOVO 20UF001CUS/20UF001CUS, BIOS R1CET65W(1.34 ) 06/17/2021
> [  384.235258] RIP: 0010:amd_iommu_update_ga+0x32/0x160
> [  384.246469] Code: <4c> 8b 62 20 48 8b 4a 18 4d 85 e4 0f 84 ca 00 00 00 48 85 c9 0f 84
> [  384.288932] RSP: 0018:ffffc9000036fca0 EFLAGS: 00010046
> [  384.300727] RAX: 0000000000000000 RBX: ffff88810b68ab60 RCX: ffff8881667a6018
> [  384.316850] RDX: 0000000000000001 RSI: ffff888107476b00 RDI: 0000000000000003

RDX, a.k.a. ir_data is NULL.  This check in svm_ir_list_add()

	if (pi->ir_data && (pi->prev_ga_tag != 0)) {

implies pi->ir_data can be NULL, but neither avic_update_iommu_vcpu_affinity()
nor amd_iommu_update_ga() check ir_data for NULL.

amd_ir_set_vcpu_affinity() returns "success" without clearing pi.is_guest_mode

	/* Note:
	 * This device has never been set up for guest mode.
	 * we should not modify the IRTE
	 */
	if (!dev_data || !dev_data->use_vapic)
		return 0;

so it's plausible svm_ir_list_add() could add to the list with a NULL
pi->ir_data.

But none of the relevant code has seen any meaningful changes since 5.15, so
odds are good I broke something :-/
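
To make the missing check concrete, here is a small user-space sketch of the kind of guard the analysis above suggests is absent. It is not the kernel code and not a proposed patch: the struct layouts and the mock_* names are made up for illustration, and only the idea (bail out or skip when ir_data is NULL instead of dereferencing it) comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; NOT the real layouts. */
struct mock_ir_data {
	int cached_cpu;	/* last CPU recorded by the mock update */
	int is_run;	/* last "is running" flag recorded */
};

struct mock_ir_entry {
	/* Assumed to be NULL if the device was never set up for guest mode. */
	struct mock_ir_data *ir_data;
};

/* Mock update helper: reject a NULL ir_data instead of dereferencing it. */
static int mock_update_ga(int cpu, int is_run, struct mock_ir_data *ir_data)
{
	if (!ir_data)
		return -1;

	ir_data->cached_cpu = cpu;
	ir_data->is_run = is_run;
	return 0;
}

/*
 * Mock list walk: skip entries whose ir_data is NULL and return how many
 * entries were actually updated.
 */
static int mock_update_affinity(struct mock_ir_entry *entries, size_t n,
				int cpu, int is_run)
{
	int updated = 0;

	for (size_t i = 0; i < n; i++) {
		if (!entries[i].ir_data)
			continue;
		if (mock_update_ga(cpu, is_run, entries[i].ir_data) == 0)
			updated++;
	}
	return updated;
}
```

Whether the right place for such a check is the caller, the callee, or the code that adds to the list in the first place is a separate question that this sketch does not try to answer.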