On Thu, Jun 20, 2013 at 06:18:27PM +0100, Chris Wilson wrote: > Having thrown around a few ideas for how to do PSR and FBC write > tracking on the frontbuffer, including the creation of a new SCANOUT > domain, the implementation that I've settled on is to detect writes to > either the GTT domain (hmm, a CPU write flush should also set the fb as > dirty) and then a deferred SCANOUT flush. The deferred flush then resets > the write domain and kicks the GTT mapping so that any further dumb > writes (i.e. dumb kms buffers or fbcon) cause a new frontbuffer > invalidation. This patch series also introduces framebuffer parameters > (which maybe should be properties) to allow userspace to opt out of the > dumb mechanism and elect to call dirtyfb itself when it requires the > scanout to be flushed. > > I've been playing around with GFDT on SNB and IVB as a means for > prototyping the delayed updates and framebuffer parameters. Sorry for the massive delay in writing down my review here, it's been sitting on my table for way too long. Mostly unsorted thoughts on the entire topic: - dirtyfb locking is really heavyweight, it takes all modeset locks currently. But it's easy to just push that down into driver callbacks so that we can do something saner. - It's not part of this patch series, but imo if we want to expose a hint to userspace that frontbuffer rendering might not be a good idea we should do that as a crtc or maybe plane property (Ville's primary plane conversion is missing for that to work out perfectly, but meh). - If we really need properties on framebuffer we imo should add support for that in drm core. But I don't think that's required, since we can just mark framebuffers as non-legacy as soon as we've seen the first ->dirtyfb ioctl call. Doesn't even need any locking since we can only ever upgrade from false to true. New userspace which explicitly tells the kernel when to invalidate an fb can just do a dummy dirtyfb call at alloc time (or not bother if it never does frontbuffer rendering at all). - I don't think we want to have a delayed reenable timer for old userspace which doesn't do ->dirtyfb calls. Afaics we've established that we need userspace's cooperation, and adding the complexity to reanable power saving features for legacy userspace is imo too much trouble. It's also something which we'll have a hard time to validate and regression test. And I really don't like new code which doesn't have good automated test coverage ;-) - The issue with just killing psr/fbc (i.e. not just disabling it for a bit) on the first frontbuffer rendering is that we seem to have too many false positives with the current checks. I haven't traced stuff really but at least across a pageflip the ->pin_count check is too coarse and will fire for the logical new backbuffer if we're unlucky. So I think we need to add new, precise (from the userspaces pov) tracking of when a gem buffer object is used for psr or fbc: bool fbc_target; int psr_target; I think we need to add those to the gem object and not the framebuffer object since that's the place where we can check it in the busy/set_domain ioctl. To avoid locking madness this should be protected with a new frontbuffer tracking lock. Of course we'll still first check pin_count to optimize away the lock taking for most objects. Note that we only need a bool for fbc since only ever one framebuffer object can be used for fbc, so no refcounting is needed. But potentially for psr we can display more than one framebuffer object on a crtc with psr enabled, hence the refcount. - Then we can fully disable fbc if fbc_target is set and fully disable psr if psr_target is set. And if we update them in setcrtc/setplane and for pageflips we'll always know which objects are logically the frontbuffer and won't have a mess with false-positive invalidation events due to us pipelining everthing. For legacy userspace we'd only ever reanable psr/fbc again at modeset and pageflip time. For non-legacy framebuffers (i.e. those where userspace called ->dirtyfb at least once) we'll just never set those tracking bits. There's a bit of ugliness around when userspace changes the fb from legacy to non-legacy mode while it's in use, but I guess we can fix that by simply clearing all the tracking bits when that happens (to avoid forgetting about an fbc/psr reference). - To make this really useful (and also make sure it's properly tested) I think we should abolish all the hw frontbuffer rendering tracking in the gtt and system agent. We only need the hw to invalidate fbc/psr on an mmio write to the buffer address (or an MI_FLIP), which seems to be enabled unconditionally (excluding frobbing debug registers). For fbc (I haven't looked too closely at the psr patches) we certainly need to fix up our code to no longer disable fbc temporarily across a pageflip - that was only necessary with fbc1, not with fbc2 (which we have since about gm45). This way we should be able to work around the performance downsides and for psr also be able to enable psr when we have sprites displayed on the screen. And since modern compositors use sprites a lot this is imo important. - Testing: I'm somewhat hopeful that we could check whether all this invalidation tracking works correctly with the pipe CRC computations. And if we disable all hw tracking failure to disable fbc/psr or flush stuff should _really_ show up badly even for manual testing. Cheers, Daniel > -Chris > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > http://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx