On Fri, Feb 24, 2012 at 11:38 PM, Mario Kleiner <mario.kleiner@xxxxxxxxxxxxxxxx> wrote: > On Feb 24, 2012, at 10:20 PM, Felix Kuehling wrote: > >> On 12-02-22 11:20 AM, Felix Kuehling wrote: >>> >>> On 12-02-21 07:49 PM, Mario Kleiner wrote: >>>> >>>> On 02/21/2012 09:07 PM, Alex Deucher wrote: >>> >>> [snip] >>>>> >>>>> The fix looks ok to me. Mario any thoughts? >>>>> >>>>> Reviewed-by: Alex Deucher<alexdeucher@xxxxxxxxx> >>>>> >>>> Hi, >>>> >>>> the fix looks ok to me for that device, but could we make it >>>> conditional on the AMD C-50 APU and similar pieces? It is the right >>>> thing to do for that gpu, but for regular desktop gpus it is too >>>> pessimistic if it defers the pageflip timestamping and completion >>>> event for an already completed flip: >>>> >>>> 1. Makes the timestamps 1 refresh too late, causing timing sensitive >>>> software like mine to detect false positives -- reporting skipped >>>> frames were there weren't any. Not as bad as missing a really skipped >>>> frame, but still not great. >>> >>> Agreed. I was going to perform some more experiments on other hardware >>> to determine what the right threshold is for different hardware >>> generations. I hope I'll get to that this week. >> >> >> I have a final version of my patch including an explanation of the >> observations it's based on (attached). It's against current drm-next >> from git://people.freedesktop.org/~airlied/linux. >> > > The idea of the current logic was that it happened that the update_pending > bit isn't read back as zero at the end of the page_flip programming > function, immediately after clearing the graphics hardware update_lock, > e.g., maybe because there is some delay between clearing that register and > the double buffering taking place. But according to the specs, if > update_pending is high, ie., the hw has "accepted" the request for a page > flip, it will complete as long as that request still happened within the > vblank. So on a return from the page_flip function with update_pending == 1 > we check if we are still inside the vblank area, ie., if the hw will > certainly complete the double buffering within the current vblank, because > the request was made in time. > > I did all my original testing of these bits with avivo hw (rv530, r600) and > with one radeon hd 5850, so i'm a bit surprised if the avivo path would > unconditionally need this adjustment. Could it be that the observed accuracy > of update_pending depends on the rest of the hw, e.g., bus or processor > speed, or bus activity? My test machines were a MacBookPro with Core2Duo 2.3 > Ghz for rv530 and a rather old AMD Athlon-64 PC for r600. > > I'm worried that your observed reliability of update_pending on >= AVIVO > asics could be an artifact of the specific computer hw you used and that > this doesn't generalize on older or different hw. If it doesn't generalize > then the patch could defer a lot of perfectly good pageflips by 1 frame, > screwing the timestamps and reducing framerate. > > Here is what the rv630 register programming guide says about the > double-buffering of the surface base address registers: > > D1GRPH_SURFACE_UPDATE_PENDING: "the double buffering occurs in vertical > retrace when D1GRPH_SURFACE_UPDATE_PENDING = 1 and > D1GRPH_UPDATE_LOCK = 0 and V_UPDATE = 1." > > D1CRTC_V_UPDATE: "Current vertical position ... 1 = within the V_UPDATE > region (between end of vertical active display and start_line)" > > For us that would mean the threshold for deferred flip completion would need > to be whatever that mentioned "start_line" is, and for the tested avivo hw, > start_line used to be == start of active scanout, so the threshold of zero > made sense. > > Alex: I don't know where start_line is stored. Could it be that this value > changed due to hw changes or changes in the kms driver or atombios? Or would > it be possible to read that value back from some register and use it as > threshold? > > If you look at radeon_get_crtc_scanoutpos() you can also see that the > returned vpos is corrected for some offsets. Could it be that something > changed there for the DCE4.1 or DCE5 display engine or whatever the C-50 APU > uses? That could also explain why suddenly such a weird threshold as vpos > > -4 is needed on your tablet, because the returned vpos is offset wrongly by > a few scanlines. > Looks like there was a change for DCE4.1/5.0. DCE1/2/3: CRTC:D1CRTC_START_LINE_CONTROL · [R/W] · 32 bits · Access: 8/16/32 · MMReg:0x60d8 DESCRIPTION: move start_line signal earlier by 1 line in CRTC1 Field Name Bits Default Description D1CRTC_PROGRESSIVE_START_LINE_EARLY 0 0x0 move start_line signal by 1 line eariler in progressive mode D1CRTC_INTERLACE_START_LINE_EARLY 8 0x1 move start_line signal by 1 line earlier in interlaced timing mode DCE4.0: CRTC:CRTC_START_LINE_CONTROL · [R/W] · 32 bits · Access: 8/16/32 · [INST0] GpuF0MMReg:0x6ecc, [INST1] GpuF0MMReg:0x7acc, [INST2] GpuF0MMReg:0x106cc, [INST3] GpuF0MMReg:0x112cc, [INST4] GpuF0MMReg:0x11ecc, [INST5] GpuF0MMReg:0x12acc DESCRIPTION: move start_line signal earlier by 1 line in CRTC Field Name Bits Default Description CRTC_PROGRESSIVE_START_LINE_EARLY 0 0x0 move start_line signal by 1 line eariler in progressive mode CRTC_INTERLACE_START_LINE_EARLY 8 0x1 move start_line signal by 1 line earlier in interlaced timing mode DCE4.1/DCE5: CRTC:CRTC_START_LINE_CONTROL · [R/W] · 32 bits · Access: 8/16/32 · [INST0] GpuF0MMReg:0x6ecc, [INST1] GpuF0MMReg:0x7acc, [INST2] GpuF0MMReg:0x106cc, [INST3] GpuF0MMReg:0x112cc, [INST4] GpuF0MMReg:0x11ecc, [INST5] GpuF0MMReg:0x12acc DESCRIPTION: move start_line signal earlier by 1 line in CRTC Field Name Bits Default Description CRTC_PROGRESSIVE_START_LINE_EARLY 0 0x0 move start_line signal by 1 line eariler in progressive mode CRTC_INTERLACE_START_LINE_EARLY 8 0x1 move start_line signal by 1 line earlier in interlaced timing mode CRTC_ADVANCED_START_LINE_POSITION 19:16 0x4 Advanced Start Line position respect to not VBLANK. Default of 4 means the Advanced Start Line is 4 lines before the first non VBLANK line Also bit 4 (CRTC_V_START_LINE) of CRTC_STATUS will tell you when whether the current line is outside (0) or inside (1) the start line region. Alex > -mario > > >>> >>>> 2. Can reduce the framerate due to throttling the client, especially >>>> on systems that are already challenged wrt. to their irq timing. >>>> >>>> Is the vblank period very short on these kind of devices? From Felix >>>> description is sounds as if it is only 2 scanlines? >>> >>> It looks like that. >> >> >> Turns out that that's not correct. Smaller negative values of vpos never >> showed up in my log output because I didn't print it in case >> update_pending was 0. The actual vblank period is 8 scan lines on this >> device. Still not much compared to the ~40 I was seeing with an external >> monitor. Anything > -4 would result in flickering in my experiments, so >> only 5 scan lines worth of time are available for submitting the page >> flip in time for the next frame. If I miss that time window, the flip is >> deferred by an extra frame. In practice that seems to occur in about 25% >> of cases on this particular device. >> >> Regards, >> Felix >> >>> Thanks for the feedback, >>> Felix >>> >>>> thanks, >>>> -mario >>>> >> >> -- >> _____ Felix Kuehling >> \ _ | MTS Software Development Eng. >> /|_| | SW-Linux Base Gfx | AMD >> |__/ \| T 905.882.2600 x8928 >> >> <0001-Fix-deferred-page-flip-detection-logic-on-Avivo-base.patch> > > > ********************************************************************* > Mario Kleiner > Max Planck Institute for Biological Cybernetics > Spemannstr. 38 > 72076 Tuebingen > Germany > > e-mail: mario.kleiner@xxxxxxxxxxxxxxxx > office: +49 (0)7071/601-1623 > fax: +49 (0)7071/601-616 > www: http://www.kyb.tuebingen.mpg.de/~kleinerm > ********************************************************************* > "For a successful technology, reality must take precedence > over public relations, for Nature cannot be fooled." > (Richard Feynman) > _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel