On Mon, Nov 23, 2015 at 04:23:21PM +0100, Mario Kleiner wrote: > On 11/20/2015 04:34 PM, Ville Syrjälä wrote: > > On Fri, Nov 20, 2015 at 04:24:50PM +0100, Mario Kleiner wrote: > > ... > > >> What we do in radeon-kms is similar. If DRM_CALLED_FROM_VBLIRQ and we > >> are no more than 1% of the display height away from start of vblank we > >> fudge scanout position in a way so that the timestamp gets computed for > >> the soon-to-begin vblank, not the old one. > > > > The problem with basing that fudging purely on DRM_CALLED_FROM_VBLIRQ is > > that if you call .get_scanout_positon() from a non-irq context between > > the irq firing and start of vblank, you'll think you're still in the > > previous frame. Hence my suggestion to note down the frame counter when > > called from the irq, and then keep doing the fudging until you've truly > > crossed the start of vblank. > > > > Ok, but why would that be a bad thing? I think we want it to think it is > in the previous frame if it is called outside the vblank irq context. > The only reason we fudge it to the next frames vblank if i vblank irq is > because we know the vblank irq handler we are executing atm. was meant > to execute within the upcoming vblank for the next frame, so we fudge > the scanout positions and thereby timestamp to correspond to that new > frame. But if something called outside irq context it should get a > scanout position/timestamp that corresponds to "reality". It would be a bad thing since it would cause the timestamp to jump backwards, and that would also cause the frame count guesstimate to go backwards. > > It would maybe be a problem to get different answers for a query at the > same scanout position if we used .get_scanout_position() for something > else than for calculating vblank timestamps, but we don't do that atm. > > Maybe i'm overlooking something here related to the somewhat rewritten > code from the last year or so? But in the original design this would be > exactly what i intended? > > ... > > >> So it's good enough for typical desktop > >> applications/compositors/games/media players, and a nice improvement > >> over the previous state, but not quite sufficient for applications that > >> need long time consistent vblank counts for long waits of multiple > >> seconds or even minutes. So it's still good to have hw counters if we > >> can get some that are reliable enough. > > > > Ah, I didn't realize you care about small errors in the counter for > > such long periods of vblank off. > > > > Actually, you are right, i was stupid - not enough sleep last friday. I > do care about such small errors, but the long vblank off periods don't > matter at all the way my software works. I query the current count and > timestamp (glXGetSyncValuesOML), calculate a target count based on those > and then schedule a swap for that target count via glXSwapBuffersMscOML. > That swapbuffers call will keep vblank irqs on until the kms pageflip is > queued. So i only care about vblank counts/timestamps being consistent > for short amounts of time, typically 1 video refresh from vblank query > to queueing a vblank event, and then from reception of that > event/queuing the pageflip to pageflip completion event. So even if the > system would be heavily loaded and my code would have big preemption > delays i think counts that are consistent over a few seconds would be > enough to keep things working. Otherwise it wouldn't work now either > with a vblank off after 5 seconds and nouveau not having vblank hw counters. > > -mario > > > >> > >> -mario > >> > >> > >>>> > >>>> -mario > >>>> > >>>>>> > >>>>>> It almost sort of works on the rs600 code path, but i need a bit of info > >>>>>> from you: > >>>>>> > >>>>>> 1. There's this register from the old specs for m76.pdf, which is not > >>>>>> part of the current register defines for radeon-kms: > >>>>>> > >>>>>> "D1CRTC_STATUS_VF_COUNT - RW - 32 bits - [GpuF0MMReg:0x60A8]" > >>>>>> > >>>>>> It contains the lower 16 bits of framecounter and the 13 bits of > >>>>>> vertical scanout position. It seems to give the same readings as the 24 > >>>>>> bit R_0060A4_D1CRTC_STATUS_FRAME_COUNT we use for the hw counter. This > >>>>>> would come handy. > >>>>>> > >>>>>> Does Evergreen and later have a same/similar register and where is it? > >>>>>> > >>>>>> 2. The hw framecounter seems to increment when the vertical scanout > >>>>>> position wraps back from (VTOTAL - 1) to 0, at least on the one DCE-3 > >>>>>> gpu i tested so far. Is this so on all asics? And is the hw counter > >>>>>> increment happening exactly at the moment that vertical scanout position > >>>>>> jumps back to zero, ie. both events are driven by the same signal? Or is > >>>>>> the framecounter increment just happening somewhere inside either > >>>>>> scanline VTOTAL-1 or scanline 0? > >>>>>> > >>>>>> > >>>>>> If we can fix this and get it into rc2 or rc3 then we could avoid a bad > >>>>>> regression and with a bit of luck at the same time improve by being able > >>>>>> to set dev->vblank_disable_immediate = true then and allow vblank irqs > >>>>>> to get turned off more aggressively for a bit of extra power saving. > >>>>>> > >>>>>> thanks, > >>>>>> -mario > >>>>> > >>>>>> -- a/drivers/gpu/drm/drm_irq.c > >>>>>> +++ b/drivers/gpu/drm/drm_irq.c > >>>>>> @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > >>>>>> unsigned long flags) > >>>>>> { > >>>>>> struct drm_vblank_crtc *vblank = &dev->vblank[pipe]; > >>>>>> - u32 cur_vblank, diff; > >>>>>> + u32 cur_vblank, diff = 0; > >>>>>> bool rc; > >>>>>> struct timeval t_vblank; > >>>>>> + const struct timeval *t_old; > >>>>>> + u64 diff_ns; > >>>>>> int count = DRM_TIMESTAMP_MAXRETRIES; > >>>>>> int framedur_ns = vblank->framedur_ns; > >>>>>> > >>>>>> @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > >>>>>> rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags); > >>>>>> } while (cur_vblank != dev->driver->get_vblank_counter(dev, pipe) && --count > 0); > >>>>>> > >>>>>> - if (dev->max_vblank_count != 0) { > >>>>>> - /* trust the hw counter when it's around */ > >>>>>> - diff = (cur_vblank - vblank->last) & dev->max_vblank_count; > >>>>>> - } else if (rc && framedur_ns) { > >>>>>> - const struct timeval *t_old; > >>>>>> - u64 diff_ns; > >>>>>> - > >>>>>> + /* > >>>>>> + * Always use vblank timestamping based method if supported to reject > >>>>>> + * redundant vblank irqs. E.g., AMD hardware needs this to not screw up > >>>>>> + * due to some irq handling quirk. > >>>>> > >>>>> Hmm. I'm thinking it would be better to simply not claim that there's a hw > >>>>> counter if it isn't reliable. > >>>>> > >>>>>> + * > >>>>>> + * This also sets the diff value for use as fallback below in case the > >>>>>> + * hw does not support a suitable hw vblank counter. > >>>>>> + */ > >>>>>> + if (rc && framedur_ns) { > >>>>>> t_old = &vblanktimestamp(dev, pipe, vblank->count); > >>>>>> diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old); > >>>>>> > >>>>>> @@ -212,11 +216,28 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > >>>>>> */ > >>>>>> diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns); > >>>>>> > >>>>>> - if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) > >>>>>> - DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored." > >>>>>> - " diff_ns = %lld, framedur_ns = %d)\n", > >>>>>> - pipe, (long long) diff_ns, framedur_ns); > >>>>>> - } else { > >>>>>> + if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) { > >>>>>> + DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored." > >>>>>> + " diff_ns = %lld, framedur_ns = %d)\n", > >>>>>> + pipe, (long long) diff_ns, framedur_ns); > >>>>>> + > >>>>>> + /* Treat this redundant invocation as no-op. */ > >>>>>> + WARN_ON_ONCE(cur_vblank != vblank->last); > >>>>>> + return; > >>>>>> + } > >>>>>> + } > >>>>>> + > >>>>>> + /* > >>>>>> + * hw counters, if marked as supported via max_vblank_count != 0, > >>>>>> + * *must* increment at leading edge of vblank and in sync with > >>>>>> + * the firing of the hw vblank irq, otherwise we have a race here > >>>>>> + * between gpu and us, where small variations in our execution > >>>>>> + * timing can cause off-by-one miscounting of vblanks. > >>>>>> + */ > >>>>>> + if (dev->max_vblank_count != 0) { > >>>>>> + /* trust the hw counter when it's around */ > >>>>>> + diff = (cur_vblank - vblank->last) & dev->max_vblank_count; > >>>>>> + } else if (!(rc && framedur_ns)) { > >>>>>> /* some kind of default for drivers w/o accurate vbl timestamping */ > >>>>>> diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; > >>>>>> } > >>>>> > >>>>> > >>> > > -- Ville Syrjälä Intel OTC _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel