On Feb 24, 2012, at 10:20 PM, Felix Kuehling wrote:
On 12-02-22 11:20 AM, Felix Kuehling wrote:
On 12-02-21 07:49 PM, Mario Kleiner wrote:
On 02/21/2012 09:07 PM, Alex Deucher wrote:
[snip]
The fix looks ok to me. Mario any thoughts?
Reviewed-by: Alex Deucher<alexdeucher@xxxxxxxxx>
Hi,
the fix looks ok to me for that device, but could we make it
conditional on the AMD C-50 APU and similar pieces? It is the right
thing to do for that gpu, but for regular desktop gpus it is too
pessimistic if it defers the pageflip timestamping and completion
event for an already completed flip:
1. Makes the timestamps 1 refresh too late, causing timing sensitive
software like mine to detect false positives -- reporting skipped
frames were there weren't any. Not as bad as missing a really
skipped
frame, but still not great.
Agreed. I was going to perform some more experiments on other
hardware
to determine what the right threshold is for different hardware
generations. I hope I'll get to that this week.
I have a final version of my patch including an explanation of the
observations it's based on (attached). It's against current drm-next
from git://people.freedesktop.org/~airlied/linux.
The idea of the current logic was that it happened that the
update_pending bit isn't read back as zero at the end of the
page_flip programming function, immediately after clearing the
graphics hardware update_lock, e.g., maybe because there is some
delay between clearing that register and the double buffering taking
place. But according to the specs, if update_pending is high, ie.,
the hw has "accepted" the request for a page flip, it will complete
as long as that request still happened within the vblank. So on a
return from the page_flip function with update_pending == 1 we check
if we are still inside the vblank area, ie., if the hw will certainly
complete the double buffering within the current vblank, because the
request was made in time.
I did all my original testing of these bits with avivo hw (rv530,
r600) and with one radeon hd 5850, so i'm a bit surprised if the
avivo path would unconditionally need this adjustment. Could it be
that the observed accuracy of update_pending depends on the rest of
the hw, e.g., bus or processor speed, or bus activity? My test
machines were a MacBookPro with Core2Duo 2.3 Ghz for rv530 and a
rather old AMD Athlon-64 PC for r600.
I'm worried that your observed reliability of update_pending on >=
AVIVO asics could be an artifact of the specific computer hw you used
and that this doesn't generalize on older or different hw. If it
doesn't generalize then the patch could defer a lot of perfectly good
pageflips by 1 frame, screwing the timestamps and reducing framerate.
Here is what the rv630 register programming guide says about the
double-buffering of the surface base address registers:
D1GRPH_SURFACE_UPDATE_PENDING: "the double buffering occurs in
vertical retrace when D1GRPH_SURFACE_UPDATE_PENDING = 1 and
D1GRPH_UPDATE_LOCK = 0 and V_UPDATE = 1."
D1CRTC_V_UPDATE: "Current vertical position ... 1 = within the
V_UPDATE region (between end of vertical active display and start_line)"
For us that would mean the threshold for deferred flip completion
would need to be whatever that mentioned "start_line" is, and for the
tested avivo hw, start_line used to be == start of active scanout, so
the threshold of zero made sense.
Alex: I don't know where start_line is stored. Could it be that this
value changed due to hw changes or changes in the kms driver or
atombios? Or would it be possible to read that value back from some
register and use it as threshold?
If you look at radeon_get_crtc_scanoutpos() you can also see that the
returned vpos is corrected for some offsets. Could it be that
something changed there for the DCE4.1 or DCE5 display engine or
whatever the C-50 APU uses? That could also explain why suddenly such
a weird threshold as vpos > -4 is needed on your tablet, because the
returned vpos is offset wrongly by a few scanlines.
-mario
2. Can reduce the framerate due to throttling the client, especially
on systems that are already challenged wrt. to their irq timing.
Is the vblank period very short on these kind of devices? From Felix
description is sounds as if it is only 2 scanlines?
It looks like that.
Turns out that that's not correct. Smaller negative values of vpos
never
showed up in my log output because I didn't print it in case
update_pending was 0. The actual vblank period is 8 scan lines on this
device. Still not much compared to the ~40 I was seeing with an
external
monitor. Anything > -4 would result in flickering in my
experiments, so
only 5 scan lines worth of time are available for submitting the page
flip in time for the next frame. If I miss that time window, the
flip is
deferred by an extra frame. In practice that seems to occur in
about 25%
of cases on this particular device.
Regards,
Felix
Thanks for the feedback,
Felix
thanks,
-mario
--
_____ Felix Kuehling
\ _ | MTS Software Development Eng.
/|_| | SW-Linux Base Gfx | AMD
|__/ \| T 905.882.2600 x8928
<0001-Fix-deferred-page-flip-detection-logic-on-Avivo-base.patch>
*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany
e-mail: mario.kleiner@xxxxxxxxxxxxxxxx
office: +49 (0)7071/601-1623
fax: +49 (0)7071/601-616
www: http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel