On Thu, Nov 19, 2015 at 06:35:04PM -0200, Paulo Zanoni wrote:
> 2015-11-19 18:06 GMT-02:00 Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>:
> > On Thu, Nov 19, 2015 at 05:44:51PM -0200, Paulo Zanoni wrote:
> >> 2014-05-26 11:26 GMT-03:00 <ville.syrjala@xxxxxxxxxxxxxxx>:
> >> > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> >> >
> >> > Now that the vblank races are plugged, we can opt out of using
> >> > the vblank disable timer and just let vblank interrupts get
> >> > disabled immediately when the last reference is dropped.
> >> >
> >> > Gen2 is the exception since it has no hardware frame counter.
> >>
> >> Hi
> >>
> >> Remember last week's FBC vblank optimization patch that had an
> >> erroneous drm_crtc_vblank_get() instead of drm_crtc_vblank_count()?
> >> After I fixed the bug in the patch I realized that it was the
> >> unbalanced vblank_get() call that moved PC state residency up.
> >>
> >> I did some experiments, and on my specific BDW machine, after running
> >> "powertop --auto-tune", I get about 15-25% PC7 residency without FBC.
> >> If I revert this patch, the number jumps to 40-45%. With FBC, the PC7
> >> residency goes from 60-70% to 85-90% when I revert this patch. I'm
> >> running just an idle Cinnamon with an open terminal.
> >>
> >> So, since the commit message lacks more details, what are the
> >> downsides of reverting this patch? What are the advantages of opting
> >> out of the vblank timer? I see my desktop does tons and tons of vblank
> >> get/put calls per second, so the disable timer makes a lot of sense.
> >
> > "Idle" desktop :(
>
> My first realization of this little problem was when I was
> implementing runtime PM :)
>
> >
> > Really the immediate disable should save power. Where are these tons of
> > vblank get/puts coming from actually?
>
> I'll take a finer look tomorrow, but I assume it's probably some
> application redrawing. I see it does calm down sometimes, but that's
> not enough to get better PC7 residency.
> >
> > I would assume you'd get a handful
> > per frame at most, and that when you're actually doing something. On an
> > idle system I would expect nothing at all happens during most frames.
> >
> > Not sure, but I guess it's possible the extra register accesses in the
> > get/puts actually cause the display to exit low power states all the time,
> > or something.
>
> I tried replacing the register macros with the _FW version and that didn't help.

Well, that would just get rid of the unclaimed reg checks. Nothing more
I think.

>
> >
> > There's also this note in Bspec (for HSW at least):
>
> I think this note is present on most (all?) gens.

Doesn't really prove anything.

> > "Workaround : Do not enable and unmask this interrupt if the associated
> > pipe is disabled. Do not leave this interrupt enabled and unmasked
> > after the associated pipe is disabled."
> > which we took to mean that having the interrupt masked but enabled is
> > fine.
>
> I'm aware of this, but I think the problem is that the resources
> drained by the constant enable+disable+enable+disable outweigh the
> resources saved by turning off vblanks.

Well the CPU is awake anyway doing the get/put, so not sure why a few
extra register accesses there would have such a huge impact.

> Not sure if there's an extra
> reason why BSpec asks us to immediately disable vblanks though...
>
> So, to summarize, the main (only?) reason is the BSpec comment?

The point is not to wake up due to interrupts when we don't need them.

> >
> > But maybe we'd actually have to frob IER too to avoid wasting
> > power somehow?
>
> With the interrupt masked on IMR, I don't think IER matters.

I'm not sure anyone actually verified that.

>
> >
> >> I also wish there was some easy way to check how this patch (or its
> >> revert) affect a bunch of different workloads...
> >>
> >> (Also CCing Chris for insightful comments on performance)
> >
> > IIRC Chris had a patch to not disable the interrupt immediately when
> > the refcount drops to 0, but instead delay the disable until the next
> > interrupt actually happens. But I guess it didn't go in? Probably I
> > should have reviewed it but didn't. It sounds like a decent idea to
> > me in any case for the active use case.
> >
> >>
> >> Thanks,
> >> Paulo
> >>
> >> >
> >> > Signed-off-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> >> > ---
> >> >  drivers/gpu/drm/i915/i915_irq.c | 8 ++++++++
> >> >  1 file changed, 8 insertions(+)
> >> >
> >> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> >> > index 28bae6e..4b2e7af 100644
> >> > --- a/drivers/gpu/drm/i915/i915_irq.c
> >> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> >> > @@ -4364,6 +4364,14 @@ void intel_irq_init(struct drm_device *dev)
> >> >  		dev->max_vblank_count = 0xffffff; /* only 24 bits of frame count */
> >> >  	}
> >> >
> >> > +	/*
> >> > +	 * Opt out of the vblank disable timer on everything except gen2.
> >> > +	 * Gen2 doesn't have a hardware frame counter and so depends on
> >> > +	 * vblank interrupts to produce sane vblank seuquence numbers.
> >> > +	 */
> >> > +	if (!IS_GEN2(dev))
> >> > +		dev->vblank_disable_immediate = true;
> >> > +
> >> >  	if (drm_core_check_feature(dev, DRIVER_MODESET)) {
> >> >  		dev->driver->get_vblank_timestamp = i915_get_vblank_timestamp;
> >> >  		dev->driver->get_scanout_position = i915_get_crtc_scanoutpos;
> >> > --
> >> > 1.8.5.5
> >> >
> >> > _______________________________________________
> >> > Intel-gfx mailing list
> >> > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> >> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> >>
> >>
> >>
> >> --
> >> Paulo Zanoni
> >
> > --
> > Ville Syrjälä
> > Intel OTC
>
>
>
> --
> Paulo Zanoni

--
Ville Syrjälä
Intel OTC
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
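
The trade-off debated in this thread (disabling the interrupt immediately on the last put, versus the idea attributed to Chris of deferring the disable until the next vblank interrupt) can be sketched as a small user-space counting model. This is only a rough illustration under stated assumptions: none of the names below (vblank_sim, sim_get, sim_put, sim_vblank, sim_run) exist in DRM or i915, and the model only counts enable/disable transitions and interrupt wakeups. It does not model power or PC7 residency.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policies for illustration only; not DRM API. */
enum disable_policy {
	DISABLE_IMMEDIATE,	/* turn the interrupt off on the last put */
	DISABLE_AT_NEXT_IRQ,	/* leave it on; turn it off in the next vblank irq */
};

struct vblank_sim {
	enum disable_policy policy;
	int refcount;
	bool irq_enabled;
	unsigned int toggles;	/* enable + disable transitions */
	unsigned int wakeups;	/* interrupts taken while enabled */
};

static void sim_init(struct vblank_sim *s, enum disable_policy policy)
{
	s->policy = policy;
	s->refcount = 0;
	s->irq_enabled = false;
	s->toggles = 0;
	s->wakeups = 0;
}

/* Take a reference; enable the interrupt if it is currently off. */
static void sim_get(struct vblank_sim *s)
{
	if (s->refcount++ == 0 && !s->irq_enabled) {
		s->irq_enabled = true;
		s->toggles++;
	}
}

/* Drop a reference; only the immediate policy disables here. */
static void sim_put(struct vblank_sim *s)
{
	assert(s->refcount > 0);	/* catch unbalanced puts */
	if (--s->refcount == 0 && s->policy == DISABLE_IMMEDIATE) {
		s->irq_enabled = false;
		s->toggles++;
	}
}

/* One vblank period elapses. */
static void sim_vblank(struct vblank_sim *s)
{
	if (!s->irq_enabled)
		return;
	s->wakeups++;
	/* Only reachable with DISABLE_AT_NEXT_IRQ: nobody holds a reference. */
	if (s->refcount == 0) {
		s->irq_enabled = false;
		s->toggles++;
	}
}

/* Run `frames` frames with `pairs_per_frame` balanced get/put pairs each. */
static void sim_run(struct vblank_sim *s, int frames, int pairs_per_frame)
{
	for (int f = 0; f < frames; f++) {
		for (int i = 0; i < pairs_per_frame; i++) {
			sim_get(s);
			sim_put(s);
		}
		sim_vblank(s);
	}
}
```

With 5 get/put pairs per frame over 10 frames, the immediate policy toggles the interrupt 100 times and takes no interrupts, while the deferred policy toggles it 20 times and takes 10 interrupts. Which of those two costs matters more on real hardware is exactly the open question in the thread.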