Re: [PATCH] drm/i915: Make sample_c messages go faster on Haswell.

On Thursday, October 30, 2014 01:01:30 PM Ville Syrjälä wrote:
> On Thu, Oct 30, 2014 at 02:32:40AM -0700, Kenneth Graunke wrote:
> > On Thursday, October 30, 2014 11:00:51 AM Ville Syrjälä wrote:
> > > On Thu, Oct 30, 2014 at 10:50:03AM +0200, Ville Syrjälä wrote:
> > > > On Wed, Oct 29, 2014 at 03:12:43PM -0700, Kenneth Graunke wrote:
> > > > > Haswell significantly improved the performance of sampler_c messages,
> > > > > but the optimization appears to be off by default.  Later platforms
> > > > > remove this bit, and apparently always enable the optimization.
> > > > > 
> > > > > Improves performance in "Counter Strike: Global Offensive" by 18%
> > > > > at default settings on Iris Pro.  No Piglit regressions.
> > > > 
> > > > Nice. We need more bits like this ;)
> > > > 
> > > > > 
> > > > > Signed-off-by: Kenneth Graunke <kenneth@xxxxxxxxxxxxx>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_reg.h | 1 +
> > > > >  drivers/gpu/drm/i915/intel_pm.c | 4 ++++
> > > > >  2 files changed, 5 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > > > > index 77fce96..340821a 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > @@ -5952,6 +5952,7 @@ enum punit_power_well {
> > > > >  #define  HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE    (1 << 6)
> > > > >  
> > > > >  #define HALF_SLICE_CHICKEN3		0xe184
> > > > > +#define   HSW_SAMPLE_C_PERFORMANCE	(1<<9)
> > > > >  #define   GEN8_CENTROID_PIXEL_OPT_DIS	(1<<8)
> > > > >  #define   GEN8_SAMPLER_POWER_BYPASS_DIS	(1<<1)
> > > > >  
> > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > > > > index 7a69eba..50c72a7 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > @@ -5736,6 +5736,10 @@ static void haswell_init_clock_gating(struct drm_device *dev)
> > > > >  	I915_WRITE(GEN7_GT_MODE,
> > > > >  		   GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4);
> > > > >  
> > > > > +	/* Make sample_c messages faster. */
> > > > 
> > > > I found a name for it in the w/a database.
> > > > 
> > > > WaSampleCChickenBitEnable:hsw
> > > > 
> > > > Reviewed-by: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx>
> > > 
> > > Oh actually it says palette won't work when this bit is on. I'm assuming
> > > that's the texture palette. Do we have any use of that anywhere?
> > 
> > That's a good point.  3DSTATE_SAMPLER_PALETTE_LOAD and the A8P8/indexed
> > formats aren't used by Mesa or xf86-video-intel, but it looks like they
> > might be used by libva.
> > 
> > Can someone confirm that libva does use the sampler palette?
> > 
> > If they do, what do we do about it?
> 
> I suppose the best option then would be to use an LRI from a batch,
> which means the register would need to be added to the cmd parser
> whitelist. This is one of the context-saved registers, so doing the
> LRI just once per context should be enough.

I don't like that solution.  For one, it's impossible - you can't LRI from 
userspace batches, even if you add it to the kernel command parser's 
whitelist, because the hardware scanner is still enabled.  Given that I've 
been waiting two years for this capability, I want to find a more immediate 
solution.

Another option is to have some sort of execbuf flag...maybe a 3D/Media "usage" 
flag.  If set to 3D, write 0x6000200...if media, write 0x6000000.  Or 
something specific.  I do hate adding more junk to the execbuf path, though.
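
To spell out those two values: the sketch below assumes HALF_SLICE_CHICKEN3 uses 
the usual masked-write layout, where the upper 16 bits of the written dword 
select which of the lower 16 bits actually change.  The helper names are made 
up for illustration and are not i915 functions.  Under that assumption, 
0x6000200 selects bits 9 and 10 and sets only bit 9, while 0x6000000 selects 
the same two bits and clears both.

```c
#include <assert.h>
#include <stdint.h>

/* Bit added by the patch quoted above. */
#define HSW_SAMPLE_C_PERFORMANCE (1u << 9)

/*
 * Hypothetical helper: build a masked-write value.  "mask" names the low
 * bits to touch, "value" gives their new state.  Both must fit in 16 bits,
 * and value may only set bits that the mask selects.
 */
static uint32_t masked_field(uint32_t mask, uint32_t value)
{
	assert((mask & ~0xffffu) == 0);
	assert((value & ~mask) == 0);
	return (mask << 16) | value;
}

/*
 * Hypothetical model of what such a write does to the register contents:
 * bits selected by the mask take the new value, all others are preserved.
 */
static uint32_t apply_masked_write(uint32_t old, uint32_t write)
{
	uint32_t mask = write >> 16;
	uint32_t value = write & 0xffffu;

	return (old & ~mask) | (value & mask);
}
```

With these, masked_field(0x600, HSW_SAMPLE_C_PERFORMANCE) gives 0x06000200 (the 
"3D" value) and masked_field(0x600, 0) gives 0x06000000 (the "media" value).  
Bits outside the mask are left alone, so neither write would disturb the rest 
of the register.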

Other ideas?

> Well that's assuming libva doesn't use the default context. I'm getting
> another itch to drop the restore inhibit flag for default contexts.
> That would actually make it possible to do these sorts of things without
> risking breakage to existing userspace. But I think Chris is going to
> scream unless the patch comes with performance data that shows it
> doesn't hurt too much.

I suppose it wouldn't affect Mesa much, since we never use the default context 
on Gen6+.  But otherwise I'd probably want to see the data, like Chris...
