On Mon, Nov 15, 2021 at 09:38:47AM +0100, Michel DDDnzer wrote: > On 2021-11-15 07:41, Lang Yu wrote: > > On Fri, Nov 12, 2021 at 05:10:27PM +0100, Michel DDDnzer wrote: > >> On 2021-11-12 16:03, Christian König wrote: > >>> Am 12.11.21 um 15:30 schrieb Michel Dänzer: > >>>> On 2021-11-12 15:29, Michel Dänzer wrote: > >>>>> On 2021-11-12 13:47, Christian König wrote: > >>>>>> Anyway this unfortunately turned out to be work for Harray and Nicholas. In detail it's about this bug report here: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D214621&data=04%7C01%7CLang.Yu%40amd.com%7C766e41a1c30544442b6b08d9a81358b0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637725623316543180%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=od%2BNksWOff%2FtsuAYZLX7lIJFQJMY2OScVqhLclPYWAQ%3D&reserved=0 > >>>>>> > >>>>>> Lang was able to reproduce the issue and narrow it down to the pin in amdgpu_display_crtc_page_flip_target(). > >>>>>> > >>>>>> In other words we somehow have an unbalanced pinning of the scanout buffer in DC. > >>>>> DC doesn't use amdgpu_display_crtc_page_flip_target AFAICT. The corresponding pin with DC would be in dm_plane_helper_prepare_fb, paired with the unpin in > >>>>> dm_plane_helper_cleanup_fb. > >>>>> > >>>>> > >>>>> With non-DC, the pin in amdgpu_display_crtc_page_flip_target is paired with the unpin in dm_plane_helper_cleanup_fb > >>>> This should say amdgpu_display_unpin_work_func. > >>> > >>> Ah! So that is the classic (e.g. non atomic) path? > >> > >> Presumably. > >> > >> > >>>>> & dce_v*_crtc_disable. One thing I notice is that the pin is guarded by if (!adev->enable_virtual_display), but the unpins seem unconditional. So could this be about virtual display, and the problem is actually trying to unpin a BO that was never pinned? > >>> > >>> Nope, my educated guess is rather that we free up the BO before amdgpu_display_unpin_work_func is called. > >>> > >>> E.g. not pin unbalance, but rather use after free. > >> > >> amdgpu_display_crtc_page_flip_target calls amdgpu_bo_ref(work->old_abo), and amdgpu_display_unpin_work_func calls amdgpu_bo_unref(&work->old_abo) only after amdgpu_bo_unpin. So what you describe could only happen if there's an imbalance elsewhere such that amdgpu_bo_unref is called more often than amdgpu_bo_ref, or maybe if amdgpu_bo_reserve fails in amdgpu_display_unpin_work_func (in which case the "failed to reserve buffer after flip" error message should appear in dmesg). > > > > > > Actually, each call to amdgpu_display_crtc_page_flip_target() will > > > > 1, init a work(amdgpu_display_unpin_work_func) to unpin an old buffer > > (crtc->primary->fb), the work will be queued in dce_vX_0_pageflip_irq(). > > > > 2, pin a new buffer, assign it to crtc->primary->fb. But how to unpin it? > > Next call. > > > > The problem is the pinned buffer of last call to > > amdgpu_display_crtc_page_flip_target() is not unpinned. > > It's unpinned in dce_v*_0_crtc_disable. I just found crtc->primary->fb is NULL when came in dce_v*_0_crtc_disable(). So it's not unpinned... > > I think I've found the problem though: dce_v*_0_crtc_do_set_base pin the BO from target_fb unconditionally, but unpin the BO from the fb parameter only if it's different from the former. So if they're the same, the BO's pin count is incremented by 1. > > > -- > Earthling Michel Dänzer | https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fredhat.com%2F&data=04%7C01%7CLang.Yu%40amd.com%7C766e41a1c30544442b6b08d9a81358b0%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637725623316543180%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=o2m1p98zVr16n58wdOWBWyEzETMAfEGqMYAtaAZozgo%3D&reserved=0 > Libre software enthusiast | Mesa and Xwayland developer