On 31.10.12 19:51, Ville Syrjälä wrote:
> On Wed, Oct 31, 2012 at 10:44:47AM -0700, Eric Anholt wrote:
>> Ville Syrjälä <ville.syrjala at linux.intel.com> writes:
>>
>>> On Tue, Oct 30, 2012 at 01:33:47PM -0500, Jesse Barnes wrote:
>>>> The hw supports async flips through the render ring, so why not expose it?
>>>> It gives us one more "tear me harder" option we can use in the DDX and
>>>> for other cases where simply flipping to the latest buffer is more
>>>> important than visual quality.
>>>
>>> The only reason I can see why anyone would really want async flips is
>>> when you're restricted to double buffering. With triple buffering you
>>> should be able to override the previous flip w/o tearing.
>>>
>>> Well, actually if you use the ring based flips, then you can't do the
>>> override. My atomic page flip code can do it because it's using mmio
>>> flips. There were also other reasons favoring mmio over ring.
>>>
>>> Once the atomic code is deemed ready, I would suggest we just nuke the
>>> ring based flip code (pun intended).
>>
>> Can you outline what exactly your plan is for doing faster-than-vblank
>> page flipping without tearing, and how it gets synchronized with
>> rendering?
>
> The faster than vrefresh flipping simply involves overwriting the
> display plane registers before they've been latched by the hardware.
> This appears to work fine already.
>
> As far as the synchronization goes, I basically just want a callback
> from the GPU when it's done with the buffer. I'm expecting to find
> some kind of GPU progress interrupt that I can enable while I'm waiting
> for the GPU to catch up. So I also need a FIFO to store the flip
> requests in the meantime. Once the GPU tells me it's ready, I pull the
> flip request from the queue and proceed with the display plane
> programming.
>
> So the synchronization part it's still quite handwavy, and I need
> to study the hardware/driver in more detail to figure out the
> specifics.
>

That's cool.
But please make sure that the behaviour will be controllable by OpenGL
applications, via some OpenGL extension. I can see use for different
modes:

a) Normal double-buffering: For deterministic, well controlled timing.
   That's what my type of applications need. Maximum control over what
   to show next, based on precise and reliable flip completion
   timestamps.

b) Triple buffering with FIFO queueing of frames ahead. This is what the
   intel ddx currently does, unfortunately for me with totally broken
   timestamping, so all my users have to disable it in the xorg.conf,
   which is quite a challenge for many Apple converts who have trouble
   with the concept of editing configuration files. It's useful for
   smoothing out occasional stalls if an app manages to render at full
   refresh rate on average, because the gpu has one frame of completed
   rendering queued up in advance. Maybe this also allows for some power
   saving if an app can render and queue frames ahead of time as fast as
   possible (race to completion) and then the cpu/gpu can go to some
   deeper sleep state earlier?

c) Your LIFO triple-buffering, as far as I understand it, with dropping
   of late frames, to reduce latency/lag for things like video games.

d) Flipping without vsync = tearing. I think this is at least useful
   for benchmarks, although not for anything else.

thanks,
-mario
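
To make the difference between modes (b) and (c) concrete, here is a toy Python model. It is only a sketch and has nothing to do with the actual driver code: the function `simulate`, the policy names "fifo" and "mailbox", and the timings are all made up for illustration. It assumes every queued frame is already fully rendered, and that the display picks at most one frame per vblank.

```python
from collections import deque


def simulate(policy, submits, vblanks):
    """Return the frame shown at each vblank (None if nothing was ever queued).

    submits: list of (time, frame_id), the time a finished frame was queued.
    vblanks: sorted list of vblank times.
    policy:  "fifo"    - show queued frames in order, one per vblank (mode b)
             "mailbox" - show the newest queued frame, drop older ones (mode c)
    """
    pending = deque()
    submits = sorted(submits)
    next_submit = 0
    current = None
    shown = []
    for vb in vblanks:
        # Collect every frame that was queued before this vblank.
        while next_submit < len(submits) and submits[next_submit][0] <= vb:
            pending.append(submits[next_submit][1])
            next_submit += 1
        if pending:
            if policy == "fifo":
                current = pending.popleft()
            elif policy == "mailbox":
                current = pending.pop()  # newest frame wins
                pending.clear()          # late frames are dropped
            else:
                raise ValueError(f"unknown policy: {policy}")
        shown.append(current)
    return shown


# Hypothetical timings in ms: three frames finish within one refresh, then a stall.
submits = [(1, "A"), (5, "B"), (10, "C"), (40, "D")]
vblanks = [0, 16, 32, 48]
print(simulate("fifo", submits, vblanks))
print(simulate("mailbox", submits, vblanks))
```

With these made-up timings, the FIFO policy still shows frame C at the last vblank even though D is already done, which is the extra latency of (b), while the mailbox policy has dropped A and B and repeats C during the stall, which is the behaviour of (c).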