On 12/14/2018 02:17 PM, Kazlauskas, Nicholas wrote: > On 12/14/18 2:06 PM, Grodzovsky, Andrey wrote: >> In general I agree with Michel that DRM solution is required to >> properly address this but since now it's not really obvious what is the >> proper solution it seems to me OK to go with this fix until it's found. >> >> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> >> >> Andrey > Thanks for the review. > > FWIW, we're not the only ones with the fb change check like this - msm > does it too. The only other user of atomic_async_check and > atomic_async_update is vc4 and they don't have it but I'd imagine they > see a similar bug with interleaving framebuffers. > > It may be difficult to develop a "fix" for the behavior in DRM given the > semantics of the function (in place update vs full swap). The > old_plane_state is technically plane->state in this case, so even just > adding cleanup_fb(plane, old_plane_state) after atomic_async_update > isn't enough. What *should* be done here is the full state swap like in > a regular atomic commit but that may cause breakages in other drivers. Your change effectively does that for AMDGPU by forcing non async update for when new_plane->state->fb != curret_plane->state->fb. But after looking more into it looks to me that this fix is the generic solution, I don't see any way around it, if you swap the fb to a new one you must not unreference it until after a new fb arrives and set as current swapping out this one (in some following commit). Why you think that making this the generic solution by moving this from dm_plane_atomic_async_check to drm_atomic_helper_async_check will break other drivers ? Andrey > > Copying plane->state and calling cleanup_fb on that would also work to > fix it, but the behavior is certainly unintuitive and asking for worse > bugs than this one to pop up since nothing else in DRM works like that. > > Nicholas Kazlauskas > >> >> On 12/14/2018 12:51 PM, Kazlauskas, Nicholas wrote: >>> On 12/14/18 12:47 PM, Grodzovsky, Andrey wrote: >>>> On 12/14/2018 12:41 PM, Kazlauskas, Nicholas wrote: >>>>> On 12/14/18 12:34 PM, Grodzovsky, Andrey wrote: >>>>>> On 12/14/2018 12:26 PM, Nicholas Kazlauskas wrote: >>>>>>> [Why] >>>>>>> The behavior of drm_atomic_helper_cleanup_planes differs depending on >>>>>>> whether the commit was asynchronous or not. When it's called from >>>>>>> amdgpu_dm_atomic_commit_tail during a typical atomic commit the >>>>>>> plane state has been swapped so it calls cleanup_fb on the old plane >>>>>>> state. >>>>>>> >>>>>>> However, in the asynchronous commit codepath the call to >>>>>>> drm_atomic_helper_commit also calls dm_plane_helper_cleanup_fb after >>>>>>> atomic_async_update has been called. Since the plane state is updated >>>>>>> in place and has not been swapped the cleanup_fb call affects the new >>>>>>> plane state. >>>>>>> >>>>>>> This results in a use after free for the given sequence: >>>>>>> >>>>>>> - Fast update, fb1 pin/ref, fb1 unpin/unref >>>>>>> - Fast update, fb2 pin/ref, fb2 unpin/unref >>>>>>> - Slow update, fb1 pin/ref, fb2 unpin/unref >>>>>> Shouldn't you have use after free already at this point ? >>>>>> >>>>>> Andrey >>>>> This assumes there was 1 reference on the bo. You would be getting use >>>>> after free on every single update (be it fast or slow) if this wasn't >>>>> the case. >>>> Why would I get it on every single update if it's all balanced - first >>>> pin/ref then unpin/unref ? >>> It's balanced, but has a reference not from the atomic code path, ie. >>> from creation. >>> >>> So this is actually: >>> >>> fb1 pin, refcount=2, b1 unpin refcount=1 >>> >>>>> So in the case where there was 1 reference on fb2 it's actually freed at >>>>> the end of slow update since the ref count is now 0. Then the use after >>>>> free happens: >>>> I still don't understand where exactly the 1 reference on fb2 during >>>> slow update comes form >>>> if all i see before that is 'Fast update, fb2 pin/ref, fb2 unpin/unref' >>>> - can you clarify that ? >>>> >>>> Andrey >>> There isn't any pin/ref on fb2 during the slow update. The pin/ref >>> happens during a prepare_fb call - which is *always* called on >>> new_plane_state. So the state looks like this in commit tail: >>> >>> old_plane_state->fb = fb2 >>> new_plane_state->fb = fb1 >>> >>> Then at the end during cleanup planes, cleanup_fb is called on >>> old_plane_state (fb2). >>> >>> Nicholas Kazlauskas >>> >>>>>>> - Fast update, fb2 pin/ref -> use after free. bug >>>>> ...here. >>>>> >>>>> You can see this exact sequence happening in Michel's log. >>>>> >>>>> Nicholas Kazlauskas >>>>> >>>>>>> [How] >>>>>>> Disallow framebuffer changes in the fast path. Since this includes >>>>>>> a NULL framebuffer, this means that only framebuffers that have >>>>>>> been previously pin+ref at least once will be used, preventing a >>>>>>> use after free. >>>>>>> >>>>>>> This has a significant throughput reduction for cursor updates where >>>>>>> the framebuffer changes. For most desktop usage this isn't a problem, >>>>>>> but it does introduce performance regressions for two specific IGT >>>>>>> tests: >>>>>>> >>>>>>> - cursor-vs-flip-toggle >>>>>>> - cursor-vs-flip-varying-size >>>>>>> >>>>>>> Cc: Leo Li <sunpeng.li@xxxxxxx> >>>>>>> Cc: Harry Wentland <harry.wentland@xxxxxxx> >>>>>>> Cc: Michel Dänzer <michel@xxxxxxxxxxx> >>>>>>> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx> >>>>>>> --- >>>>>>> drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 10 ++++++++++ >>>>>>> 1 file changed, 10 insertions(+) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>>>> index d01315965af0..dc1eb9ec2c38 100644 >>>>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c >>>>>>> @@ -3628,10 +3628,20 @@ static int dm_plane_atomic_check(struct drm_plane *plane, >>>>>>> static int dm_plane_atomic_async_check(struct drm_plane *plane, >>>>>>> struct drm_plane_state *new_plane_state) >>>>>>> { >>>>>>> + struct drm_plane_state *old_plane_state = >>>>>>> + drm_atomic_get_old_plane_state(new_plane_state->state, plane); >>>>>>> + >>>>>>> /* Only support async updates on cursor planes. */ >>>>>>> if (plane->type != DRM_PLANE_TYPE_CURSOR) >>>>>>> return -EINVAL; >>>>>>> >>>>>>> + /* >>>>>>> + * DRM calls prepare_fb and cleanup_fb on new_plane_state for >>>>>>> + * async commits so don't allow fb changes. >>>>>>> + */ >>>>>>> + if (old_plane_state->fb != new_plane_state->fb) >>>>>>> + return -EINVAL; >>>>>>> + >>>>>>> return 0; >>>>>>> } >>>>>>> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@xxxxxxxxxxxxxxxxxxxxx >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >> _______________________________________________ amd-gfx mailing list amd-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/amd-gfx