Re: Use after free with GEM shadow-buffered planes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 17 Nov 2023 at 15:08, Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
>
> Hi
>
> Am 15.11.23 um 17:32 schrieb Alyssa Ross:
> > [Originally reported at https://gitlab.freedesktop.org/drm/misc/-/issues/33]
> >
> > The following happens in a cycle:
> >
> >   • An atomic state is allocated
> >   • A plane state is allocated (drm_gem_duplicate_shadow_plane_state())
> >   • Commit (drm_atomic_helper_commit(), possibly nonblocking / asynchronously)
> >   • The previous plane state is freed (drm_gem_destroy_shadow_plane_state())
> >   • The atomic state is put
>
> We acquire a reference on the state at
>
>
> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L2057
>
> and release it at
>
>
> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L1833
>
> All new sub-state, such as for planes, should be kept allocated during
> that time.
>
> >
> > But what happens if a nonblocking commit doesn't get scheduled until a
> > couple of iterations later in the cycle?  Plane states are not
> > refcounted, so by that point, the plane state has been freed, and so
> > commit_tail() will encounter a use after free when it accesses the plane
> > state.
>
> I think it's assumed that the old plane state can be gone by the time
> the commit happens. But why would the new plane state? Did you figure
> this out in your analysis?
>
> >
> > I encountered this issue using simpledrm on the Asahi kernel based on
> > v6.5, but none of the files I examined to determine that this is a
> > use-after-free have been modified from mainline.  I've also reviewed the
> > diff between my kernel and tip of mainline (8f6f76a6a29f), and didn't
> > see anything that would affect this issue.
>
> It could be that we're passing the wrong state at
>
>
> https://elixir.bootlin.com/linux/v6.5/source/drivers/gpu/drm/drm_atomic_helper.c#L2919

Yeah that's buggy, and I think we we discussed this, I missed it. So
the lifetime rules are somewhat tricky:

- An atomic commit is only allowed to look at the subsequent state
objects until it signals comletion of that commit with the hw_done()
helper. After that point ownership of those state structures is handed
to the next commit, which can release them.

- The next commit has to wait for hw_done completions of the previous
commits, so that it doesn't release state objects too early.

- This means that when you're after hw_done() in the commit code, you
can only look at the old state structures. This means that
->cleanup_fb is fine, but the ->end_fb_access case is not. It needs to
be moved much earlier, probably into the varios commit_planes()
functions (we have two of those iirc).

- But ->end_fb_access _is_ in the right place for the case where we've
aborted a commit before the point of no return.

> in some corner case, but I really don't see how.
>
> At least that's how it's supposed to work. I'm still trying to get an
> idea how you end up with an invalid plane state.

It's indeed a very tricky situation here, but I think I understand the
bug. Maybe we need a bunch more commits in relevant places to explain
this all ..

Cheers, Sima

>
> Best regards
> Thomas
>
> >
> > Here's an example of a use after free.  It's been a couple of weeks
> > since I thoroughly investigated this, but from memory, in this case, the
> > plane state has been overwritten by a struct drm_crtc_state.
> >
> > Unable to handle kernel paging request at virtual address 0000000100000049
> > Mem abort info:
> >    ESR = 0x0000000096000005
> >    EC = 0x25: DABT (current EL), IL = 32 bits
> >    SET = 0, FnV = 0
> >    EA = 0, S1PTW = 0
> >    FSC = 0x05: level 1 translation fault
> > Data abort info:
> >    ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
> >    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> >    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > user pgtable: 16k pages, 48-bit VAs, pgdp=000000080e0e31b0
> > [0000000100000049] pgd=080000083d390003, p4d=080000083d390003, pud=080000083db9c003, pmd=0000000000000000
> > Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> > Modules linked in: overlay uas usb_storage usbhid rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device bnep des_generic libdes md4 brcmfmac_wcc joydev hci_bcm4377 bluetooth brcmfmac brcmutil cfg80211 hid_magicmouse ecdh_generic ecc rfkill snd_soc_macaudio macsmc_hid macsmc_power macsmc_reboot ofpart spi_nor apple_isp videobuf2_dma_sg snd_soc_cs42l84 snd_soc_tas2764 videobuf2_memops clk_apple_nco snd_soc_apple_mca apple_admac videobuf2_v4l2 videodev videobuf2_common mc hid_apple pwm_apple leds_pwm apple_soc_cpufreq xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6t_rpfilter ipt_rpfilter xt_pkttype xt_LOG nf_log_syslog nft_compat nf_tables nfnetlink loop tun tap macvlan bridge stp llc fuse zstd zram dm_crypt xhci_plat_hcd xhci_hcd nvmem_spmi_mfd rtc_macsmc gpio_macsmc tps6598x dockchannel_hid simple_mfd_spmi regmap_spmi nvme_apple phy_apple_atc dwc3 pcie_apple typec pci_host_common udc_core apple_sart macsmc_rtkit apple_rtkit_helper apple_dockchannel macsmc apple_rtkit mfd_core
> >   spmi_apple_controller nvmem_apple_efuses pinctrl_apple_gpio spi_apple i2c_apple apple_dart apple_mailbox btrfs xor xor_neon raid6_pq
> > CPU: 0 PID: 1095074 Comm: kworker/u16:11 Tainted: G S                 6.5.0-asahi #1-NixOS
> > Hardware name: Apple MacBook Pro (13-inch, M2, 2022) (DT)
> > Workqueue: events_unbound commit_work
> > pstate: 21400009 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> > pc : drm_gem_fb_vunmap+0x18/0x74
> > lr : drm_gem_end_shadow_fb_access+0x1c/0x2c
> > sp : ffff800087ea3d00
> > x29: ffff800087ea3d00 x28: 0000000000000000 x27: 0000000000000000
> > x26: ffff800081325000 x25: 00000000fffffef7 x24: ffff000046c5b560
> > x23: ffff000001fcaa05 x22: 0000000000000000 x21: 0000000100000001
> > x20: ffff000046c5b500 x19: 0000000000000001 x18: 0000000000000000
> > x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff2e2d5ab0
> > x14: 0000000000000195 x13: 0000000000000000 x12: ffff800081310a80
> > x11: 0000000000000001 x10: 1444e7e23f083897 x9 : 6e82f0b7605f292f
> > x8 : ffff0001249e0f48 x7 : 0000000000000004 x6 : 0000000000000190
> > x5 : 0000000000000001 x4 : ffff000093c54440 x3 : ffff00000e968000
> > x2 : ffff80008077883c x1 : ffff00009ce37498 x0 : 0000000100000001
> > Call trace:
> >   drm_gem_fb_vunmap+0x18/0x74
> >   drm_gem_end_shadow_fb_access+0x1c/0x2c
> >   drm_atomic_helper_cleanup_planes+0x58/0xd8
> >   drm_atomic_helper_commit_tail+0x90/0xa0
> >   commit_tail+0x15c/0x188
> >   commit_work+0x14/0x20
> >   process_one_work+0x1e0/0x344
> >   worker_thread+0x68/0x424
> >   kthread+0xf4/0x100
> >   ret_from_fork+0x10/0x20
> > Code: 910003fd a90153f3 f90013f5 aa0003f5 (f9402400)
> > ---[ end trace 0000000000000000 ]---
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux