On Wed, Oct 25, 2023 at 04:13:14PM -0700, Si-Wei Liu wrote: > When the vhost-vdpa device is being closed, vhost_vdpa_cleanup() doesn't > clean up the vqs pointer after free. This could lead to use-after-tree > when _compat_vdpa_reset() tries to access the vqs that are freed already. > Fix is to set vqs pointer to NULL at the end of vhost_vdpa_cleanup() > after getting freed, which is guarded by atomic opened state. > > BUG: unable to handle page fault for address: 00000001005b4af4 > #PF: supervisor read access in kernel mode > #PF: error_code(0x0000) - not-present page > PGD 16a80a067 P4D 0 > Oops: 0000 [#1] PREEMPT SMP NOPTI > CPU: 4 PID: 40387 Comm: qemu-kvm Not tainted 6.6.0-rc7+ #3 > Hardware name: Dell Inc. PowerEdge R750/0PJ80M, BIOS 1.8.2 09/14/2022 > RIP: 0010:_compat_vdpa_reset.isra.0+0x27/0xb0 [vhost_vdpa] > Code: 90 90 90 0f 1f 44 00 00 41 55 4c 8d ae 08 03 00 00 41 54 55 48 > 89 f5 53 4c 8b a6 00 03 00 00 48 85 ff 74 49 48 8b 07 4c 89 ef <48> 8b > 80 88 45 00 00 48 c1 e8 08 48 83 f0 01 89 c3 e8 73 5e 9b dc > RSP: 0018:ff73a85762073ba0 EFLAGS: 00010286 > RAX: 00000001005b056c RBX: ff32b13ca6994c68 RCX: 0000000000000002 > RDX: 0000000000000001 RSI: ff32b13c07559000 RDI: ff32b13c07559308 > RBP: ff32b13c07559000 R08: 0000000000000000 R09: ff32b12ca497c0f0 > R10: ff73a85762073c58 R11: 0000000c106f9de3 R12: ff32b12c95b1d050 > R13: ff32b13c07559308 R14: ff32b12d0ddc5100 R15: 0000000000008002 > FS: 00007fec5b8cbf80(0000) GS:ff32b13bbfc80000(0000) > knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000001005b4af4 CR3: 000000015644a003 CR4: 0000000000773ee0 > PKRU: 55555554 > Call Trace: > <TASK> > ? __die+0x20/0x70 > ? page_fault_oops+0x76/0x170 > ? exc_page_fault+0x65/0x150 > ? asm_exc_page_fault+0x22/0x30 > ? _compat_vdpa_reset.isra.0+0x27/0xb0 [vhost_vdpa] > vhost_vdpa_open+0x57/0x280 [vhost_vdpa] > ? __pfx_chrdev_open+0x10/0x10 > chrdev_open+0xc6/0x260 > ? __pfx_chrdev_open+0x10/0x10 > do_dentry_open+0x16e/0x530 > do_open+0x21c/0x400 > path_openat+0x111/0x290 > do_filp_open+0xb2/0x160 > ? __check_object_size.part.0+0x5e/0x140 > do_sys_openat2+0x96/0xd0 > __x64_sys_openat+0x53/0xa0 > do_syscall_64+0x59/0x90 > ? syscall_exit_to_user_mode+0x22/0x40 > ? do_syscall_64+0x69/0x90 > ? syscall_exit_to_user_mode+0x22/0x40 > ? do_syscall_64+0x69/0x90 > ? do_syscall_64+0x69/0x90 > ? syscall_exit_to_user_mode+0x22/0x40 > ? do_syscall_64+0x69/0x90 > ? exc_page_fault+0x65/0x150 > entry_SYSCALL_64_after_hwframe+0x6e/0xd8 > > Fixes: 10cbf8dfaf93 ("vhost-vdpa: clean iotlb map during reset for older userspace") > Fixes: ac7e98c73c05 ("vhost-vdpa: fix NULL pointer deref in _compat_vdpa_reset") So these two are all in next correct? I really do not like it how 10cbf8dfaf936e3ef1f5d7fdc6e9dada268ba6bb introduced a regression and then apparently we keep fixing things up? Can I squash these 3 commits? > Reported-by: Lei Yang <leiyang@xxxxxxxxxx> > Closes: https://lore.kernel.org/all/CAPpAL=yHDqn1AztEcN3MpS8o4M+BL_HVy02FdpiHN7DWd91HwQ@xxxxxxxxxxxxxx/ > Signed-off-by: Si-Wei Liu <si-wei.liu@xxxxxxxxxx> > --- > drivers/vhost/vdpa.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c > index 9a2343c45df0..30df5c58db73 100644 > --- a/drivers/vhost/vdpa.c > +++ b/drivers/vhost/vdpa.c > @@ -1355,6 +1355,7 @@ static void vhost_vdpa_cleanup(struct vhost_vdpa *v) > vhost_vdpa_free_domain(v); > vhost_dev_cleanup(&v->vdev); > kfree(v->vdev.vqs); > + v->vdev.vqs = NULL; > } > > static int vhost_vdpa_open(struct inode *inode, struct file *filep) > -- > 2.39.3