On Thu, Nov 19, 2015 at 10:06:01PM +0200, Ville Syrjälä wrote: > On Thu, Nov 19, 2015 at 05:44:51PM -0200, Paulo Zanoni wrote: > > 2014-05-26 11:26 GMT-03:00 <ville.syrjala@xxxxxxxxxxxxxxx>: > > > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > > > Now that the vblank races are plugged, we can opt out of using > > > the vblank disable timer and just let vblank interrupts get > > > disabled immediately when the last reference is dropped. > > > > > > Gen2 is the exception since it has no hardware frame counter. > > > > Hi > > > > Remember last week's FBC vblank optimization patch that had an > > erroneous drm_crtc_vblank_get() instead of drm_crtc_vblank_count()? > > After I fixed the bug in the patch I realized that it was the > > unbalanced vblank_get() call that moved PC state residency up. > > > > I did some experiments, and on my specific BDW machine, after running > > "powertop --auto-tune", I get about 15-25% PC7 residency without FBC. > > If I revert this patch, the number jumps to 40-45%. With FBC, the PC7 > > residency goes from 60-70% to 85-90% when I revert this patch. I'm > > running just an idle Cinnamon with an open terminal. > > > > So, since the commit message lacks more details, what are the > > downsides of reverting this patch? What are the advantages of opting > > out of the vblank timer? I see my desktop does tons and tons of vblank > > get/put calls per second, so the disable timer makes a lot of sense. > > "Idle" desktop :( > > Really the immediate disable should save power. Where are these tons of > vblank get/puts coming from actually? I would assume you'd get a handful > per frame at most, and that when you're actually doing something. On an > idle system I would expect nothing at all happens during most frames. Yes, with the immediate disable we end up with a few get/put per frame of rendering, as userspace queries and queues the next flip/swap. With more clients, there are more opportunities for queries prior to queuing the swap. It really shouldn't be more than a handful per frame if all the clients are vrefresh limited. (The oddity was the vblank_mode=0 rendering where we still maintained the handful of get/put per frame...) > > I also wish there was some easy way to check how this patch (or its > > revert) affect a bunch of different workloads... Maybe add the PC residencies to the power meter in intel-gpu-overlay, alongside the RC residencies? > > (Also CCing Chris for insightful comments on performance) > > IIRC Chris had a patch to not disable the interrupt immediately when > the refcount drops to 0, but instead delay the disable until the next > interrupt actually happens. But I guess it didn't go in? Probably I > should have reviewed it but didn't. It sounds like a decent idea to > me in any case for the active use case. The discussion went off into the barriers around the vblank counter, and I forgot to bring it up again. I think even Mario was happy enough about it. Paulo, you may want to try http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=nightly&id=ef97f4680d316cc8f9656dded1e1e41544c7225e or you can change the KEEPALIVES value in src/sna/sna_dri2.c for a similar effect. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx