On Wed, Jan 02, 2019 at 03:02:04PM +0000, Kazlauskas, Nicholas wrote: > On 12/24/18 7:15 AM, Daniel Vetter wrote: > > On Fri, Dec 21, 2018 at 09:33:24AM -0500, Nicholas Kazlauskas wrote: > >> The behavior of drm_atomic_helper_cleanup_planes differs depending on > >> whether the commit was an asynchronous update or not. > >> > >> For a typical (non-async) atomic commit prepare_fb is called on the > >> new_plane_state and cleanup_fb is called on the old_plane_state. > >> > >> However, async commits are performed in place and don't swap the state > >> for the plane. The call to prepare_fb happens on the new_plane_state > >> and the call to cleanup_fb is also called on the new_plane_state in > >> this case (since the state hasn't swapped). > >> > >> This behavior can lead to use-after-free or unpin of an active fb. > >> > >> Consider the following sequence of events for interleaving fbs: > >> > >> - Async update, fb1 prepare, fb1 cleanup_fb > >> - Async update, fb2 prepare, fb2 cleanup_fb > >> - Non-async update, fb1 prepare, fb2 cleanup_fb > >> - Async update, fb2 cleanup_fb -> double cleanup, use-after-free > > > > I think I see your bug, but I'm completely lost in your description above. > > > > I think this is ok as a short-term gap, but imo better if it's a separate > > if condition with a FIXME comment. > > > > Long-term we want to fix this, and I think simplest way to do that is if > > we expect drivers to store the old fb in the new_plane_state (and check > > that with a WARN_ON like the others). I think that should work. > > > > We probably also need some locking on top, to prevent races with the > > cleanup_fb calls done by non-blocking commits, to make sure those clean up > > the right fb. > > -Daniel > > Hi Daniel, > > The description is supposed to be saying that the wrong fb is being > cleaned up because in state updates to the plane don't swap the state > pointer - which is required by the cleanup planes helper function. Yeah I inferred that, but the actual commit message isn't really clear on this, and the example you've listed only confused me. We need good commit messages, so that half a year down the road we still know what exactly we've been thinking :-) Especially in an area that's really tricky, like synchronization of the nonblocking atomic updates against normal updates (and each another), which has seen plenty of bugs in the past. > But putting that aside, I think what you suggest here would work best > for "fixing" the problem. Storing the old fb pointer in the new state > looks kind of odd, but as long as it's documented and there's the > WARN_ON in the helper I think it's fine. I think I would prefer this > over a solution that blocks fb changes in the helper altogether. > > I don't think any additional drm level locking is necessary here. The > race with cleanup_fb calls is already something you have to worry about > when dealing with async commits even if the fb hasn't changed. Most > drivers have their own internal locks or ref-counting that can handle this. > > So to summarize, I think I can post a new patch series that addresses > this problem by fixing existing async commit usage in vc4 and amdgpu for > framebuffer changes. The last patch in the series could be the one that > adds a WARN_ON in the drm_atomic_helper_async_commit helper along with a > comment indicating that the old fb should be in the new plane state. > > Does this sound reasonable to you? Let's please do the the minimal bugfixe first (probably cc: stable), and figure out how to properly fix things up long-term. -Daniel > > Nicholas Kazlauskas > > > > >> This situation has been observed in practice for a double buffered > >> cursor when closing an X client. The non-async update occurs because > >> the new_plane_state->crtc != old_plane_state->crtc which forces the > >> non-async path to occur. > >> > >> The simplest fix for this is to block fb updates in > >> drm_atomic_helper_async_check. This guarantees that the framebuffer > >> will have previously been prepared and any subsequent async updates > >> will always call prepare and cleanup_fb like the non-async atomic > >> commit path would. > >> > >> Cc: Michel Dänzer <michel@xxxxxxxxxxx> > >> Cc: Daniel Vetter <daniel.vetter@xxxxxxxx> > >> Cc: Andrey Grodzovsky <andrey.grodzovsky@xxxxxxx> > >> Cc: Harry Wentland <harry.wentland@xxxxxxx> > >> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@xxxxxxx> > >> --- > >> drivers/gpu/drm/drm_atomic_helper.c | 3 ++- > >> 1 file changed, 2 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c > >> index 54e2ae614dcc..d2f80bf14f86 100644 > >> --- a/drivers/gpu/drm/drm_atomic_helper.c > >> +++ b/drivers/gpu/drm/drm_atomic_helper.c > >> @@ -1599,7 +1599,8 @@ int drm_atomic_helper_async_check(struct drm_device *dev, > >> return -EINVAL; > >> > >> if (!new_plane_state->crtc || > >> - old_plane_state->crtc != new_plane_state->crtc) > >> + old_plane_state->crtc != new_plane_state->crtc || > >> + old_plane_state->fb != new_plane_state->fb) > >> return -EINVAL; > >> > >> funcs = plane->helper_private; > >> -- > >> 2.17.1 > >> > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel