Re: [PATCH] drm/vmwgfx: Keep a gem reference to user bos in surfaces

Zack Rusin <zack.rusin@xxxxxxxxxxxx> · Fri, 22 Dec 2023 20:07:39 -0500

On Thu, Dec 21, 2023 at 5:54 AM Sverdlin, Alexander
<alexander.sverdlin@xxxxxxxxxxx> wrote:
>
> Hi Zack,
>
> thank you for the patch!
>
> On Thu, 2023-09-28 at 00:13 -0400, Zack Rusin wrote:
> > From: Zack Rusin <zackr@xxxxxxxxxx>
> >
> > Surfaces can be backed (i.e. stored in) memory objects (mob's) which
> > are created and managed by the userspace as GEM buffers. Surfaces
> > grab only a ttm reference which means that the gem object can
> > be deleted underneath us, especially in cases where prime buffer
> > export is used.
> >
> > Make sure that all userspace surfaces which are backed by gem objects
> > hold a gem reference to make sure they're not deleted before vmw
> > surfaces are done with them, which fixes:
> > ------------[ cut here ]------------
> > refcount_t: underflow; use-after-free.
> > WARNING: CPU: 2 PID: 2632 at lib/refcount.c:28 refcount_warn_saturate+0xfb/0x150
> > Modules linked in: overlay vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock snd_ens1371 snd_ac97_codec ac97_bus snd_pcm gameport>
> > CPU: 2 PID: 2632 Comm: vmw_ref_count Not tainted 6.5.0-rc2-vmwgfx #1
> > Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
> > RIP: 0010:refcount_warn_saturate+0xfb/0x150
> > Code: eb 9e 0f b6 1d 8b 5b a6 01 80 fb 01 0f 87 ba e4 80 00 83 e3 01 75 89 48 c7 c7 c0 3c f9 a3 c6 05 6f 5b a6 01 01 e8 15 81 98 ff <0f> 0b e9 6f ff ff ff 0f b>
> > RSP: 0018:ffffbdc34344bba0 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027
> > RDX: ffff960475ea1548 RSI: 0000000000000001 RDI: ffff960475ea1540
> > RBP: ffffbdc34344bba8 R08: 0000000000000003 R09: 65646e75203a745f
> > R10: ffffffffa5b32b20 R11: 72657466612d6573 R12: ffff96037d6a6400
> > R13: ffff9603484805b0 R14: 000000000000000b R15: ffff9603bed06060
> > FS:  00007f5fd8520c40(0000) GS:ffff960475e80000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f5fda755000 CR3: 000000010d012005 CR4: 00000000003706e0
> > Call Trace:
> >  <TASK>
> >  ? show_regs+0x6e/0x80
> >  ? refcount_warn_saturate+0xfb/0x150
> >  ? __warn+0x91/0x150
> >  ? refcount_warn_saturate+0xfb/0x150
> >  ? report_bug+0x19d/0x1b0
> >  ? handle_bug+0x46/0x80
> >  ? exc_invalid_op+0x1d/0x80
> >  ? asm_exc_invalid_op+0x1f/0x30
> >  ? refcount_warn_saturate+0xfb/0x150
> >  drm_gem_object_handle_put_unlocked+0xba/0x110 [drm]
> >  drm_gem_object_release_handle+0x6e/0x80 [drm]
> >  drm_gem_handle_delete+0x6a/0xc0 [drm]
> >  ? __pfx_vmw_bo_unref_ioctl+0x10/0x10 [vmwgfx]
> >  vmw_bo_unref_ioctl+0x33/0x40 [vmwgfx]
> >  drm_ioctl_kernel+0xbc/0x160 [drm]
> >  drm_ioctl+0x2d2/0x580 [drm]
> >  ? __pfx_vmw_bo_unref_ioctl+0x10/0x10 [vmwgfx]
> >  ? do_vmi_munmap+0xee/0x180
> >  vmw_generic_ioctl+0xbd/0x180 [vmwgfx]
> >  vmw_unlocked_ioctl+0x19/0x20 [vmwgfx]
> >  __x64_sys_ioctl+0x99/0xd0
> >  do_syscall_64+0x5d/0x90
> >  ? syscall_exit_to_user_mode+0x2a/0x50
> >  ? do_syscall_64+0x6d/0x90
> >  ? handle_mm_fault+0x16e/0x2f0
> >  ? exit_to_user_mode_prepare+0x34/0x170
> >  ? irqentry_exit_to_user_mode+0xd/0x20
> >  ? irqentry_exit+0x3f/0x50
> >  ? exc_page_fault+0x8e/0x190
> >  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > RIP: 0033:0x7f5fda51aaff
> > Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 7>
> > RSP: 002b:00007ffd536a4d30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 00007ffd536a4de0 RCX: 00007f5fda51aaff
> > RDX: 00007ffd536a4de0 RSI: 0000000040086442 RDI: 0000000000000003
> > RBP: 0000000040086442 R08: 000055fa603ada50 R09: 0000000000000000
> > R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffd536a51b8
> > R13: 0000000000000003 R14: 000055fa5ebb4c80 R15: 00007f5fda90f040
> >  </TASK>
> > ---[ end trace 0000000000000000 ]---
> >
> > A lot of the analyis on the bug was done by Murray McAllister and
> > Ian Forbes.
> >
> > Reported-by: Murray McAllister <murray.mcallister@xxxxxxxxx>
> > Cc: Ian Forbes <iforbes@xxxxxxxxxx>
> > Signed-off-by: Zack Rusin <zackr@xxxxxxxxxx>
> > Fixes: a950b989ea29 ("drm/vmwgfx: Do not drop the reference to the handle too soon")
> > Cc: <stable@xxxxxxxxxxxxxxx> # v6.2+
>
> Do you remember the particular reason this was marked 6.2+?

That's because that's the kernel release where the commit this one is
fixing first landed.

> We see this on Debian 6.1.67 (which at least has the mentioned
> "drm/vmwgfx: Do not drop the reference to the handle too soon"):

The original had to be backported there. I'll ask someone on my team
to check the branches the original was backported to see if this
change even applies on those and then we'll see what we can do. In the
meantime if you know anyone on the Debian kernel team suggesting this
as a cherry-pick might also be a good idea.

z