There is a use-after-free condition due to a race between kvm_xen_eventfd_deassign() and kvm_xen_eventfd_update(): mutex_lock(&kvm->lock) evtchnfd = idr_find(...) mutex_unlock(&kvm->lock) mutex_lock(&kvm->lock) evtchnfd = idr_remove(...) mutex_unlock(&kvm->lock) synchronize_srcu(&kvm->srcu) kfree(evtchnfd) [evtchnfd is stale now] if (evtchnfd->type != data->u.evtchn.type) return -EINVAL; ... My understanding is that kvm_xen_eventfd_update() forgets to enter SRCU critical section, and thus synchronize_srcu() in kvm_xen_eventfd_deassign() does not really synchronize much, which results in prematurly kfree()ing evtchnfd. The condition is rather hard to hit (and I sure hope it is not purely theoretical), but when I throw a mdelay() between the lines, e.g. diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index d7af40240248..2b3495517c99 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -14,6 +14,7 @@ #include <linux/eventfd.h> #include <linux/kvm_host.h> #include <linux/sched/stat.h> +#include <linux/delay.h> #include <trace/events/kvm.h> #include <xen/interface/xen.h> @@ -1836,6 +1837,8 @@ static int kvm_xen_eventfd_update(struct kvm *kvm, if (!evtchnfd) return -ENOENT; + mdelay(1); + /* For an UPDATE, nothing may change except the priority/vcpu */ if (evtchnfd->type != data->u.evtchn.type) return -EINVAL; and race update/deassign IOCTLs, then KFENCE reports start popping out: [ 299.233021] ================================================================== [ 299.233043] BUG: KFENCE: use-after-free read in kvm_xen_hvm_set_attr+0x7cb/0x930 [kvm] [ 299.233566] Use-after-free read at 0x000000004c7839ea (in kfence-#134): [ 299.233581] kvm_xen_hvm_set_attr+0x7cb/0x930 [kvm] [ 299.234094] kvm_arch_vm_ioctl+0xcda/0xee0 [kvm] [ 299.234613] kvm_vm_ioctl+0x703/0x1320 [kvm] [ 299.235074] __x64_sys_ioctl+0xb8/0xf0 [ 299.235091] do_syscall_64+0x55/0x80 [ 299.235105] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 299.235128] kfence-#134: 0x000000008893823b-0x00000000c2c2058a, size=24, cache=kmalloc-32 [ 299.235143] allocated by task 941 on cpu 0 at 288.238587s: [ 299.235890] __kmem_cache_alloc_node+0x357/0x420 [ 299.235908] kmalloc_trace+0x26/0x60 [ 299.235921] kvm_xen_hvm_set_attr+0x119/0x930 [kvm] [ 299.236425] kvm_arch_vm_ioctl+0xcda/0xee0 [kvm] [ 299.236905] kvm_vm_ioctl+0x703/0x1320 [kvm] [ 299.237380] __x64_sys_ioctl+0xb8/0xf0 [ 299.237394] do_syscall_64+0x55/0x80 [ 299.237407] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 299.237427] freed by task 941 on cpu 0 at 288.576296s: [ 299.238181] kvm_xen_hvm_set_attr+0x784/0x930 [kvm] [ 299.238666] kvm_arch_vm_ioctl+0xcda/0xee0 [kvm] [ 299.239094] kvm_vm_ioctl+0x703/0x1320 [kvm] [ 299.239210] __x64_sys_ioctl+0xb8/0xf0 [ 299.239213] do_syscall_64+0x55/0x80 [ 299.239216] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 299.239222] CPU: 3 PID: 944 Comm: a.out Tainted: G B 6.1.0+ #63 [ 299.239227] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.1-1-1 04/01/2014 [ 299.239230] ================================================================== PATCH 1/2 proposes a fix. PATCH 2/2 takes the opportunity to further simplify the code a bit. Michal Luczaj (2): KVM: x86/xen: Fix use-after-free in kvm_xen_eventfd_update() KVM: x86/xen: Simplify eventfd IOCTLs arch/x86/kvm/xen.c | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-) -- 2.39.0