Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL and TGL

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2019-11-18 at 20:25 +0000, Lisovskiy, Stanislav wrote:

P.S: (In addition to my last yesterday letter):

> 
> That is actually a violation of BSpec, we are not using two slices
> for same
> pipe while we could. We had already enough of bw issues, why should
> we make our life even harder?
> 
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > 
> > > > > My eyes!
> > > > > 
> > > > > I have to think we should be able to reduce all that to a
> > > > > handful
> > > > > of lines of code.
> > > > 
> > > > Yeah, then we are going to have a huge function with lots of
> > > > weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > 
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > > 
> > > > Would you prefer to have it like
> > > > 
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > > 
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird
> > > > looking
> > > > if-else statements, GEN checks and so on.
> > > > 
> > > > I knew this table was not going to look like it is done
> > > > typically
> > > > here - but really my opinion that it is way better than
> > > > hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > > 
> > > > If you don't like it - could you please explain why this is
> > > > exactly
> > > > worse than having long functions with hardcoded checks?
> > > 
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify
> > > what
> > > is the distance of each pipe from each dbuf slice. Then all we
> > > have
> > > to do is minimize the total distance. The following tables I
> > > think
> > > should encode that reasonably concisely:
> > > 
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > > 
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> > 
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> > 
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> > A/S1 + B/S2 = 1 + 4 = 5
> > A/S2 + B/S1 = 2 + 1 = 3
> > but pipes A C get S1 and S2.
> > A/S1 + C/S2 = 1 + 1 = 2
> > A/S2 + C/S1 = 2 + 2 = 4
> > So it's exactly as my distance table says it should be.
> > 
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> > 
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> > 
> > > 
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > > 

One more thing I didn't mention yesterday - this metric matrix simply
reflects the affinity/cost of using different DBuf slices by different
Pipes. Sure it will help you to figure out which slice is preferrable
to use.
However you will still have to map it properly to overall active 
pipes mask, which will require you to account which slices are already
used, how many pipes can use a single slice and so on.

See:
I.e if you have lets say pipes A,B(just looked down to my table to know
the expected conf - so convenient), you will need to get S1, S2 for
pipes A, B respectively.
So from your distance metric table for ICL you will first figure out
that for pipe A you prefer S1, and for pipe B you prefer... also S1
as it has lower metric than S2, despite that you should have used S1
and S2 here according to BSpec Then you will have to write some
function that will iterate and count somehow remaining unused slices,
taking into consideration that despite S1 is closer to pipe B, you have
to choose S2 if there are two pipes.
And if there are 3 pipes - you have to... choose again S1 for pipe A,
S1 for pipe B(so you can't even mark S1 as used to simplify things) and
S2 for pipe C.

Interesting to see how you will going to encode this check. Sure I know
you will find some short way. But this will be for sure some runtime
cycle(s).

While I'm just getting this mask immediately in compile time.
While I can immediately compare it to BSpec.

But sure metric matrix sounds more fancy( I like this graph theory
stuff)

What are we arguing here about? Even if my code is not perfect here
my approach at least makes sense and could be even better than yours,
got r-b, BAT/IGT green. 

So Ville, what is the actual point of this kind of unexpected code
review slamming? Could be I just annoyed you too much asking questions
:)

Do we actually share opinions here or there is only one single correct
opinion and perfect approach?

There are warnings yep(just as in many other patches here) - sure will
fix.

Best Regards,

Lisovskiy Stanislav

> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> > 
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> > Well, I'd rather do that than decode through layers of cpp macros.
> > Also atm the table has no extra restrictions so it's clearly there
> > just because someone thought they'd write out the results of the
> > distance calculations. In general bspec style is somewhat annoying
> > where they try to write out things in the lowest level of
> > absraction
> > possible (they are hardware people after all). I'd much prefer a
> > higher level description of the issue.
> 
> I would prefer to see immediately what is different as this table
> makes it possible right away
> Also to me this table simply intuitively more understandable.
> 
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> 
> We have similar CDCLK table in intel_cdclk, while you could 
> also argue that we might had come up with some formula.
> 
> Could be we could make the table approach itself simpler - 
> that is quite possible, but implementing a distance algorithm
> seems to me to be a dangerous idea, despite that it of course
> looks better to me(yes it is).
> 
> Best Regards,
> 
> Lisovskiy Stanislav
> 
> Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7,
> 02160 Espoo
> 
> ________________________________________
> From: Ville Syrjälä [ville.syrjala@xxxxxxxxxxxxxxx]
> Sent: Monday, November 18, 2019 8:18 PM
> To: Lisovskiy, Stanislav
> Cc: Saarinen, Jani; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Roper, Matthew
> D; maarten.lankhorst@xxxxxxxxxxxxxxx; Ausmus, James
> Subject: Re: [PATCH v7] drm/i915: Enable second dbuf slice for ICL
> and TGL
> 
> On Mon, Nov 18, 2019 at 05:17:51PM +0000, Lisovskiy, Stanislav wrote:
> > On Mon, 2019-11-18 at 17:27 +0200, Ville Syrjälä wrote:
> > > On Mon, Nov 18, 2019 at 09:19:18AM +0000, Lisovskiy, Stanislav
> > > wrote:
> > > > On Fri, 2019-11-15 at 22:19 +0200, Ville Syrjälä wrote:
> > > > > On Thu, Nov 14, 2019 at 02:24:49PM +0200, Stanislav Lisovskiy
> > > > > wrote:
> > > > > > Also implemented algorithm for choosing DBuf slice
> > > > > > configuration
> > > > > > based on active pipes, pipe ratio as stated in BSpec 12716.
> > > > > > 
> > > > > > Now pipe allocation still stays proportional to pipe width
> > > > > > as
> > > > > > before,
> > > > > > however within allowed DBuf slice for this particular
> > > > > > configuration.
> > > > > > 
> > > > > > v2: Remove unneeded check from commit as ddb enabled slices
> > > > > > might
> > > > > >     now differ from hw state.
> > > > > > 
> > > > > > v3: - Added new field "supported_slices" to ddb to separate
> > > > > > max
> > > > > >       supported slices vs currently enabled, to avoid
> > > > > > confusion.
> > > > > >     - Removed obsolete comments and code related to
> > > > > > DBuf(Matthew
> > > > > > Roper).
> > > > > >     - Some code style and long line removal(Matthew Roper).
> > > > > >     - Added WARN_ON to new ddb range offset calc
> > > > > > function(Matthew
> > > > > > Roper).
> > > > > >     - Removed platform specific call to calc pipe ratio
> > > > > > from
> > > > > > ddb
> > > > > >       allocation function and fixed the return
> > > > > > value(Matthew
> > > > > > Roper)
> > > > > >     - Refactored DBUF slice allocation table to improve
> > > > > > readability
> > > > > >     - Added DBUF slice allocation for TGL as well.
> > > > > >     - ICL(however not TGL) seems to voluntarily enable
> > > > > > second
> > > > > > DBuf
> > > > > > slice
> > > > > >       after pm suspend/resume causing a mismatch failure,
> > > > > > because
> > > > > > we
> > > > > >       update DBuf slices only if we do a modeset, however
> > > > > > this
> > > > > > check
> > > > > >       is done always. Fixed it to be done only when modeset
> > > > > > for
> > > > > > ICL.
> > > > > > 
> > > > > > v4: - Now enabled slices is not actually a number, but a
> > > > > > bitmask,
> > > > > >       because we might need to enable second slice only and
> > > > > > number
> > > > > >       of slices would still 1 and that current code doesn't
> > > > > > allow.
> > > > > >     - Remove redundant duplicate code to have some unified
> > > > > > way
> > > > > > of
> > > > > >       enabling dbuf slices instead of hardcoding.
> > > > > > 
> > > > > > v5: - Fix failing gen9_assert_dbuf_enabled as it was
> > > > > > naively
> > > > > > thinking
> > > > > >       that we have only one DBUF_CTL slice. Now another
> > > > > > version
> > > > > > for
> > > > > >       gen11 will check both slices as only second one can
> > > > > > be
> > > > > > enabled,
> > > > > >       to keep CI happy.
> > > > > > 
> > > > > > v6: - Removed enabled dbuf assertion completely. Previous
> > > > > > code
> > > > > >       was keeping dbuf enabled, even(!) when _dbuf_disable
> > > > > > is
> > > > > > called.
> > > > > >       Currently we enable/disable dbuf slices based on
> > > > > > requrement
> > > > > >       so no point in asserting this here.
> > > > > >     - Removed unnecessary modeset check from
> > > > > > verify_wm_state(Matthew Roper)
> > > > > >     - Slices intersection after union is same as final
> > > > > > result(Matthew Roper)
> > > > > >     - Moved DBuf slices to intel_device_info(Matthew Roper)
> > > > > >     - Some macros added(Matthew Roper)
> > > > > >     - ddb range is now always less or equal to ddb size -
> > > > > > no
> > > > > > need
> > > > > > for additional
> > > > > >       checks(previously needed as we had some bandwidth
> > > > > > checks
> > > > > > in
> > > > > > there which
> > > > > >       could "suddenly" return ddb_size smaller than it is.
> > > > > > 
> > > > > > v7: Minor costemic changes:
> > > > > >     - Changed icl_dbuf_slices_restore name to
> > > > > > icl_program_dbuf_slices
> > > > > >       as it more corresponds to the actual functionality.
> > > > > >     - Some simplification with supported slices for BXT and
> > > > > > GLK
> > > > > >     - skl_pipe_downscale_amount ->
> > > > > > icl_pipe_downscale_amount as
> > > > > >       this is not used for skl anymore.
> > > > > >     - Changed DBuf slice assignment order for TGL case
> > > > > > 
> > > > > > Reviewed-by: Matthew Roper <matthew.d.roper@xxxxxxxxx>
> > > > > > Signed-off-by: Stanislav Lisovskiy <
> > > > > > stanislav.lisovskiy@xxxxxxxxx>
> > > > > > Cc: Matthew Roper <matthew.d.roper@xxxxxxxxx>
> > > > > > Cc: Ville Syrjälä <ville.syrjala@xxxxxxxxx>
> > > > > > Cc: James Ausmus <james.ausmus@xxxxxxxxx>
> > > > > > ---
> > > > > >  drivers/gpu/drm/i915/display/intel_display.c  |  26 +-
> > > > > >  .../drm/i915/display/intel_display_power.c    |  98 ++---
> > > > > >  .../drm/i915/display/intel_display_power.h    |   2 +
> > > > > >  drivers/gpu/drm/i915/i915_drv.c               |   5 +
> > > > > >  drivers/gpu/drm/i915/i915_drv.h               |   2 +-
> > > > > >  drivers/gpu/drm/i915/i915_pci.c               |   6 +-
> > > > > >  drivers/gpu/drm/i915/i915_reg.h               |   8 +-
> > > > > >  drivers/gpu/drm/i915/intel_device_info.h      |   1 +
> > > > > >  drivers/gpu/drm/i915/intel_pm.c               | 387
> > > > > > ++++++++++++++++--
> > > > > >  9 files changed, 431 insertions(+), 104 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > index 876fc25968bf..bd7aff675198 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > > @@ -13387,7 +13387,7 @@ static void verify_wm_state(struct
> > > > > > intel_crtc *crtc,
> > > > > > 
> > > > > >       if (INTEL_GEN(dev_priv) >= 11 &&
> > > > > >           hw->ddb.enabled_slices != sw_ddb->enabled_slices)
> > > > > > -             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > > %u, got
> > > > > > %u)\n",
> > > > > > +             DRM_ERROR("mismatch in DBUF Slices (expected
> > > > > > %x, got
> > > > > > %x)\n",
> > > > > >                         sw_ddb->enabled_slices,
> > > > > >                         hw->ddb.enabled_slices);
> > > > > > 
> > > > > > @@ -14614,15 +14614,24 @@ static void
> > > > > > skl_commit_modeset_enables(struct intel_atomic_state
> > > > > > *state)
> > > > > >       u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > > 
> > > > > >       u8 required_slices = state-
> > > > > > > wm_results.ddb.enabled_slices;
> > > > > > 
> > > > > >       struct skl_ddb_entry entries[I915_MAX_PIPES] = {};
> > > > > > +     u8 slices_union = hw_enabled_slices |
> > > > > > required_slices;
> > > > > > +     u8 slices_intersection = required_slices;
> > > > > > 
> > > > > >       for_each_oldnew_intel_crtc_in_state(state, crtc,
> > > > > > old_crtc_state, new_crtc_state, i)
> > > > > >               /* ignore allocations for crtc's that have
> > > > > > been
> > > > > > turned
> > > > > > off. */
> > > > > >               if (new_crtc_state->hw.active)
> > > > > >                       entries[i] = old_crtc_state-
> > > > > > > wm.skl.ddb;
> > > > > > 
> > > > > > -     /* If 2nd DBuf slice required, enable it here */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices >
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     DRM_DEBUG_KMS("DBuf req slices %d hw slices %d\n",
> > > > > > +                   required_slices, hw_enabled_slices);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice required, enable it here,
> > > > > > +      * as here we only add more slices, but not remove to
> > > > > > prevent
> > > > > > +      * issues if somebody still uses those.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_union);
> > > > > > 
> > > > > >       /*
> > > > > >        * Whenever the number of active pipes changes, we
> > > > > > need
> > > > > > to make
> > > > > > sure we
> > > > > > @@ -14681,9 +14690,12 @@ static void
> > > > > > skl_commit_modeset_enables(struct intel_atomic_state
> > > > > > *state)
> > > > > >               }
> > > > > >       } while (progress);
> > > > > > 
> > > > > > -     /* If 2nd DBuf slice is no more required disable it
> > > > > > */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > > > > > +     /*
> > > > > > +      * If some other DBuf slice is not needed, disable it
> > > > > > here,
> > > > > > +      * as here we only remove slices, which are not
> > > > > > needed
> > > > > > anymore.
> > > > > > +      */
> > > > > > +     if (INTEL_GEN(dev_priv) >= 11 && required_slices !=
> > > > > > hw_enabled_slices)
> > > > > > +             icl_dbuf_slices_update(dev_priv,
> > > > > > slices_intersection);
> > > > > 
> > > > > This is still a totally wrong place for this. We need to be
> > > > > sure
> > > > > the
> > > > > planes have switched over to the new ddb layout before we
> > > > > power
> > > > > off
> > > > > anything.
> > > > 
> > > > I have just modified already existing line here, not added by
> > > > me.
> > > > The point of this patch is starting to use second DBuf slice
> > > > but
> > > > not fixing all kinds of issues related to DBuf code at the same
> > > > time.
> > > 
> > > You're re-enabling code I disabled on purpose. It has to be fixed
> > > properly or not touched at all. In hindsight I should have
> > > probably
> > > just ripped it all out to not give people bad ideas.
> > 
> > -     /* If 2nd DBuf slice is no more required disable it */
> > > > > > -     if (INTEL_GEN(dev_priv) >= 11 && required_slices <
> > > > > > hw_enabled_slices)
> > > > > > -             icl_dbuf_slices_update(dev_priv,
> > > > > > required_slices);
> > 
> > I'm not reenabling it - this is code is out there right now.
> 
> It's _dead_ code, and it's like that on purpose. You're now
> attempting
> to re-enable it without fixing the underlying issues.
> 
> > This is all again about changing all kinds of stuff at the same
> > time.
> > 
> > Why we can't just do things step by step? You have a point that
> > this is wrong place ok, but as we see current code causes no
> > regressions. Just tell me your idea where to put it and I will
> > do that, if you don't have it yet - let it stay there, if CI shows
> > no regressions currently.
> 
> The same global state tricks apply here as with the cdclk/sagv stuff.
> 1. enable new resources/bump up clocks
> 2. reconfigure planes/etc.
> 3. wait for vblanks
> 4. disable unused resources/drop the clocks
> 
> > 
> > > 
> > > > 
> > > > Of course this is nice idea to fix it - if that is wrong plane,
> > > > but current code somehow lives with it and probably this should
> > > > be
> > > > with some other patch.
> > > > 
> > > > > 
> > > > > >  }
> > > > > > 
> > > > > >  static void intel_atomic_helper_free_state(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > diff --git
> > > > > > a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > index ce1b64f4dd44..a78bc9b61862 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.c
> > > > > > @@ -1031,15 +1031,6 @@ static bool
> > > > > > gen9_dc_off_power_well_enabled(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >               (I915_READ(DC_STATE_EN) &
> > > > > > DC_STATE_EN_UPTO_DC5_DC6_MASK) == 0);
> > > > > >  }
> > > > > > 
> > > > > > -static void gen9_assert_dbuf_enabled(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > -{
> > > > > > -     u32 tmp = I915_READ(DBUF_CTL);
> > > > > > -
> > > > > > -     WARN((tmp & (DBUF_POWER_STATE | DBUF_POWER_REQUEST))
> > > > > > !=
> > > > > > -          (DBUF_POWER_STATE | DBUF_POWER_REQUEST),
> > > > > > -          "Unexpected DBuf power power state (0x%08x)\n",
> > > > > > tmp);
> > > > > > -}
> > > > > > -
> > > > > >  static void gen9_disable_dc_states(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > >       struct intel_cdclk_state cdclk_state = {};
> > > > > > @@ -1055,8 +1046,6 @@ static void
> > > > > > gen9_disable_dc_states(struct
> > > > > > drm_i915_private *dev_priv)
> > > > > >       /* Can't read out voltage_level so can't use
> > > > > > intel_cdclk_changed() */
> > > > > >       WARN_ON(intel_cdclk_needs_modeset(&dev_priv-
> > > > > > >cdclk.hw,
> > > > > > &cdclk_state));
> > > > > > 
> > > > > > -     gen9_assert_dbuf_enabled(dev_priv);
> > > > > > -
> > > > > >       if (IS_GEN9_LP(dev_priv))
> > > > > >               bxt_verify_ddi_phy_power_wells(dev_priv);
> > > > > > 
> > > > > > @@ -4254,31 +4243,51 @@ static void
> > > > > > gen9_dbuf_disable(struct
> > > > > > drm_i915_private *dev_priv)
> > > > > >       intel_dbuf_slice_set(dev_priv, DBUF_CTL, false);
> > > > > >  }
> > > > > > 
> > > > > > -static u8 intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > +int intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     if (INTEL_GEN(dev_priv) < 11)
> > > > > > -             return 1;
> > > > > > -     return 2;
> > > > > > +     return INTEL_INFO(dev_priv)->supported_slices;
> > > > > > +}
> > > > > 
> > > > > Kinda pointless function now. 'supported_slices' is not a
> > > > > very
> > > > > good
> > > > > name. Should be 'num_dbuf_slices' or something along those
> > > > > lines.
> > > > 
> > > > Supported slices really means how much slices we support in
> > > > hardware.
> > > > num_dbuf_slices will not reflect that, it will be more
> > > > ambigious.
> > > > I don't like this name.
> > > 
> > > supported_slices doesn't even say "dbuf" anywhere. So it's even
> > > more
> > > ambiguous. Also we already have num_pipes, num_sprites, etc.
> > 
> > ..and in current code we have... "enabled_slices", no? Ouch! :) IS
> > this
> > really that important and worth arguing at the very last moment?
> > whether number of
> > supported_slices_in_hardware, should be named "supported slices" or
> > "num_dbuf_slices"? If yes, lets go on, miss some targets and
> > continue
> > discussing this "soooo important" thing.
> > 
> > Some one might come up that it is called "num_slices" or
> > "dbuf_num_slices". Seriously, may be "supported_num_dbuf_slices" to
> > keep everyones bloated egos happy??
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > +{
> > > > > > +     const u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > 
> > > > > Seems a rather pointless variable. const at least is
> > > > > pointless.
> > > > > 
> > > > > > +
> > > > > > +     icl_dbuf_slices_update(dev_priv, hw_enabled_slices);
> > > > > >  }
> > > > > > 
> > > > > >  void icl_dbuf_slices_update(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                           u8 req_slices)
> > > > > >  {
> > > > > > -     const u8 hw_enabled_slices = dev_priv-
> > > > > > > wm.skl_hw.ddb.enabled_slices;
> > > > > > 
> > > > > > -     bool ret;
> > > > > > +     bool ret = true;
> > > > > > +     int i;
> > > > > > +     int max_slices = intel_dbuf_max_slices(dev_priv);
> > > > > > 
> > > > > > -     if (req_slices > intel_dbuf_max_slices(dev_priv)) {
> > > > > > +     if (hweight8(req_slices) >
> > > > > > intel_dbuf_max_slices(dev_priv)) {
> > > > > >               DRM_ERROR("Invalid number of dbuf slices
> > > > > > requested\n");
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > -     if (req_slices == hw_enabled_slices || req_slices ==
> > > > > > 0)
> > > > > > -             return;
> > > > > > +     DRM_DEBUG_KMS("Updating dbuf slices to %x\n",
> > > > > > req_slices);
> > > > > > 
> > > > > > -     if (req_slices > hw_enabled_slices)
> > > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > > DBUF_CTL_S2,
> > > > > > true);
> > > > > > -     else
> > > > > > -             ret = intel_dbuf_slice_set(dev_priv,
> > > > > > DBUF_CTL_S2,
> > > > > > false);
> > > > > > +     for (i = 0; i < max_slices; i++) {
> > > > > > +             int slice_bit = BIT(i);
> > > > > > +             bool slice_set = (slice_bit & req_slices) !=
> > > > > > 0;
> > > > > > +
> > > > > > +             switch (slice_bit) {
> > > > > > +             case DBUF_S1_BIT:
> > > > > > +                     ret = ret &&
> > > > > > intel_dbuf_slice_set(dev_priv,
> > > > > > +                                                       DBU
> > > > > > F_
> > > > > > CTL_S1,
> > > > > > +                                                       sli
> > > > > > ce
> > > > > > _set);
> > > > > > +                     break;
> > > > > > +             case DBUF_S2_BIT:
> > > > > > +                     ret = ret &&
> > > > > > intel_dbuf_slice_set(dev_priv,
> > > > > > +                                                       DBU
> > > > > > F_
> > > > > > CTL_S2,
> > > > > > +                                                       sli
> > > > > > ce
> > > > > > _set);
> > > > > 
> > > > > The return value on intel_dbuf_slice_set() seems pointless.
> > > > > I'd
> > > > > nuke
> > > > > it.
> > > > > 
> > > > > Also as long you have intel_dbuf_slice_set() I'd probably
> > > > > just
> > > > > pass
> > > > > the
> > > > > slice index there and let it deal with the register details.
> > > > > DBUF_CTL
> > > > > should probably be parametrized as well so you don't need
> > > > > annoying
> > > > > switch
> > > > > statements.
> > > > 
> > > > What if it fails? Shouldn't we still have some clue what state
> > > > do
> > > > we
> > > > have/had at the moment?
> > > 
> > > It if fails you log an error and keep going.
> > 
> > Ah sure. And let hw/sw state be completely different.
> > Also see next comment about this.
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +                     break;
> > > > > > +             default:
> > > > > > +                     MISSING_CASE(slice_bit);
> > > > > > +             }
> > > > > > +     }
> > > > > > 
> > > > > >       if (ret)
> > > > > >               dev_priv->wm.skl_hw.ddb.enabled_slices =
> > > > > > req_slices;
> > > > > > @@ -4286,40 +4295,21 @@ void icl_dbuf_slices_update(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > > 
> > > > > >  static void icl_dbuf_enable(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) |
> > > > > > DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if (!(I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         !(I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power enable timeout\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that we only have 1
> > > > > > slice,
> > > > > > see
> > > > > > -              * intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Just power up 1 slice, we will
> > > > > > +      * figure out later which slices we have and what we
> > > > > > need.
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = DBUF_S1_BIT;
> > > > > 
> > > > > That stuff really shouldn't be modified by low level
> > > > > functions.
> > > > > The intial value should come from readout, and after that it
> > > > > should
> > > > > be adjusted as part of the overall atomic state.
> > > > 
> > > > Well, as you can see currently this is even worse - has some
> > > > weird
> > > > hardcoded stuff and is modified like that. Also if we are
> > > > enabling
> > > > the
> > > > dbuf slices, shouldn't we really determine what value should be
> > > > there?
> > > > 
> > > > Also I wouldn't really always trust the readout, as to me it
> > > > seems
> > > > icelake resets it constantly to 2 after suspend/resume.
> > > 
> > > The point of readout is that we syncrhonize our software state
> > > with
> > > the
> > > current state of hardware without having to modify the hardware
> > > state.
> > > Once synchronized the driver takes over and tells the hw what to
> > > do.
> > 
> > That is especially nice after you say "report error and don't care"
> > when I really want to reflect hw state in sw state, by checking
> > whether
> > we could set dbuf slices or not.
> > Now you say that we need to synchronize software state with hw
> > state,
> > however if I fail to modify it - then "who cares".
> 
> Yes. If you fail to modify the hardware state the whole thing is
> broken anyway. There's is *nothing* we can do at this stage in the
> commit to fix it.
> 
> > 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > > >  }
> > > > > > 
> > > > > >  static void icl_dbuf_disable(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     I915_WRITE(DBUF_CTL_S1, I915_READ(DBUF_CTL_S1) &
> > > > > > ~DBUF_POWER_REQUEST);
> > > > > > -     I915_WRITE(DBUF_CTL_S2, I915_READ(DBUF_CTL_S2) &
> > > > > > ~DBUF_POWER_REQUEST);
> > > > > > -     POSTING_READ(DBUF_CTL_S2);
> > > > > > -
> > > > > > -     udelay(10);
> > > > > > -
> > > > > > -     if ((I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE) ||
> > > > > > -         (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE))
> > > > > > -             DRM_ERROR("DBuf power disable timeout!\n");
> > > > > > -     else
> > > > > > -             /*
> > > > > > -              * FIXME: for now pretend that the first
> > > > > > slice
> > > > > > is
> > > > > > always
> > > > > > -              * enabled, see
> > > > > > intel_enabled_dbuf_slices_num().
> > > > > > -              */
> > > > > > -             dev_priv->wm.skl_hw.ddb.enabled_slices = 1;
> > > > > > +     /*
> > > > > > +      * Disable all slices
> > > > > > +      */
> > > > > > +     dev_priv->wm.skl_hw.ddb.enabled_slices = 0;
> > > > > > +     icl_program_dbuf_slices(dev_priv);
> > > > > >  }
> > > > > > 
> > > > > >  static void icl_mbus_init(struct drm_i915_private
> > > > > > *dev_priv)
> > > > > > diff --git
> > > > > > a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > index 1da04f3e0fb3..b7057d68ad78 100644
> > > > > > --- a/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > +++ b/drivers/gpu/drm/i915/display/intel_display_power.h
> > > > > > @@ -311,8 +311,10 @@ intel_display_power_put_async(struct
> > > > > > drm_i915_private *i915,
> > > > > >       for ((wf) = intel_display_power_get((i915),
> > > > > > (domain));
> > > > > > (wf); \
> > > > > >            intel_display_power_put_async((i915), (domain),
> > > > > > (wf)),
> > > > > > (wf) = 0)
> > > > > > 
> > > > > > +int intel_dbuf_max_slices(struct drm_i915_private
> > > > > > *dev_priv);
> > > > > >  void icl_dbuf_slices_update(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                           u8 req_slices);
> > > > > > +void icl_program_dbuf_slices(struct drm_i915_private
> > > > > > *dev_priv);
> > > > > > 
> > > > > >  void chv_phy_powergate_lanes(struct intel_encoder
> > > > > > *encoder,
> > > > > >                            bool override, unsigned int
> > > > > > mask);
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c
> > > > > > b/drivers/gpu/drm/i915/i915_drv.c
> > > > > > index 82e4e6bf08c3..96741e68633a 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > > @@ -51,6 +51,7 @@
> > > > > >  #include "display/intel_bw.h"
> > > > > >  #include "display/intel_cdclk.h"
> > > > > >  #include "display/intel_display_types.h"
> > > > > > +#include "display/intel_display_power.h"
> > > > > >  #include "display/intel_dp.h"
> > > > > >  #include "display/intel_fbdev.h"
> > > > > >  #include "display/intel_hotplug.h"
> > > > > > @@ -2588,6 +2589,10 @@ static int
> > > > > > intel_runtime_resume(struct
> > > > > > device *kdev)
> > > > > >       if (IS_VALLEYVIEW(dev_priv) ||
> > > > > > IS_CHERRYVIEW(dev_priv))
> > > > > >               ret = vlv_resume_prepare(dev_priv, true);
> > > > > > 
> > > > > > +     /* Weird hack to fix ICL hardware bug, as it resets
> > > > > > DBUF slices
> > > > > > reg */
> > > > > > +     if (INTEL_GEN(dev_priv) == 11)
> > > > > > +             icl_program_dbuf_slices(dev_priv);
> > > > > > +
> > > > > >       intel_uncore_runtime_resume(&dev_priv->uncore);
> > > > > > 
> > > > > >       intel_runtime_pm_enable_interrupts(dev_priv);
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > index 7e0f67babe20..a396977c9c2d 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > > > @@ -804,7 +804,7 @@ static inline bool
> > > > > > skl_ddb_entry_equal(const
> > > > > > struct skl_ddb_entry *e1,
> > > > > >  }
> > > > > > 
> > > > > >  struct skl_ddb_allocation {
> > > > > 
> > > > > This struct is quite pointless. Should be killed off.
> > > > 
> > > > Legacy stuff, not added by me. Quite fine with removing it -
> > > > but should we really do all this at the same time and in same
> > > > patch?
> > > 
> > > Not the same patch. But clean up the cobwebs first, then add the
> > > new
> > > stuff. Much easier that way around.
> > 
> > There is a shit ton load of cobwebs to clean. Personally my opinion
> > is
> > that current dbuf controlling code is total garbage - which has
> > lots of
> > hardcoded stuff. Good luck enabling only second dbuf slice, good
> > luck
> > disabling/enabling dbuf properly....
> > Where you were, when this was committed then?
> 
> Looking at something else unfortunately. Also it has just kept
> bitrotting since because no one has cared.
> 
> > 
> > If we need to fix it - fine. But may be we should do it in stages?
> > Point of this patch is "enable second dbuf slice", next patch
> > "remove
> > unneeded structs" and so on.
> > While you want me to shuffle everything at the same time, which is
> > not
> > good idea, especially considering complexity and amount of legacy
> > stuff
> > we currently have.
> 
> Building new stuff on top of a decomposing garbage pile is not
> a good idea: a) it's harder than building on clean foundation,
> b) it'll probably fall over and get buried in the old garbage
> at which point you're likely going to be right back where you
> started.
> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > -     u8 enabled_slices; /* GEN11 has configurable 2 slices
> > > > > > */
> > > > > > +     u8 enabled_slices;   /* Bitmask of currently enabled
> > > > > > DBuf
> > > > > > slices */
> > > > > >  };
> > > > > > 
> > > > > >  struct skl_ddb_values {
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c
> > > > > > b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > index 1bb701d32a5d..8e86af505730 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_pci.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > > > > > @@ -614,7 +614,8 @@ static const struct intel_device_info
> > > > > > intel_cherryview_info = {
> > > > > >       .has_gt_uc = 1, \
> > > > > >       .display.has_hdcp = 1, \
> > > > > >       .display.has_ipc = 1, \
> > > > > > -     .ddb_size = 896
> > > > > > +     .ddb_size = 896, \
> > > > > > +     .supported_slices = 1
> > > > > > 
> > > > > >  #define SKL_PLATFORM \
> > > > > >       GEN9_FEATURES, \
> > > > > > @@ -650,6 +651,7 @@ static const struct intel_device_info
> > > > > > intel_skylake_gt4_info = {
> > > > > >       GEN(9), \
> > > > > >       .is_lp = 1, \
> > > > > >       .display.has_hotplug = 1, \
> > > > > > +     .supported_slices = 1, \
> > > > > >       .engine_mask = BIT(RCS0) | BIT(VCS0) | BIT(BCS0) |
> > > > > > BIT(VECS0),
> > > > > > \
> > > > > >       .pipe_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C),
> > > > > > \
> > > > > >       .has_64bit_reloc = 1, \
> > > > > > @@ -737,6 +739,7 @@ static const struct intel_device_info
> > > > > > intel_coffeelake_gt3_info = {
> > > > > >       GEN9_FEATURES, \
> > > > > >       GEN(10), \
> > > > > >       .ddb_size = 1024, \
> > > > > > +     .supported_slices = 1, \
> > > > > >       .display.has_dsc = 1, \
> > > > > >       .has_coherent_ggtt = false, \
> > > > > >       GLK_COLORS
> > > > > > @@ -773,6 +776,7 @@ static const struct intel_device_info
> > > > > > intel_cannonlake_info = {
> > > > > >       }, \
> > > > > >       GEN(11), \
> > > > > >       .ddb_size = 2048, \
> > > > > > +     .supported_slices = 2, \
> > > > > >       .has_logical_ring_elsq = 1, \
> > > > > >       .color = { .degamma_lut_size = 33, .gamma_lut_size =
> > > > > > 262145 }
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_reg.h
> > > > > > b/drivers/gpu/drm/i915/i915_reg.h
> > > > > > index a607ea520829..fba5731063d8 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_reg.h
> > > > > > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > > > > > @@ -7734,9 +7734,11 @@ enum {
> > > > > >  #define DISP_ARB_CTL2        _MMIO(0x45004)
> > > > > >  #define  DISP_DATA_PARTITION_5_6     (1 << 6)
> > > > > >  #define  DISP_IPC_ENABLE             (1 << 3)
> > > > > > -#define DBUF_CTL     _MMIO(0x45008)
> > > > > > -#define DBUF_CTL_S1  _MMIO(0x45008)
> > > > > > -#define DBUF_CTL_S2  _MMIO(0x44FE8)
> > > > > > +#define DBUF_S1_BIT                  BIT(0)
> > > > > > +#define DBUF_S2_BIT                  BIT(1)
> > > > > 
> > > > > These are not bits inside DBUF_CTL so they should not be
> > > > > here.
> > > > > I'd just nuke them entirely. If you think we need more than
> > > > > an
> > > > > int to refer to a slice then IMO something simple like
> > > > > 
> > > > > enum dbuf_slice {
> > > > >         DBUF_S1,
> > > > >         DBUF_S2,
> > > > > };
> > > > > 
> > > > > would match the general apporach we use for other things much
> > > > > better
> > > > > rather than defining some shifted values.
> > > > 
> > > > We need to combine slices - so bitmask is quite suitable for
> > > > that,
> > > > I can of course use this enum, but then I'll have to re-encode
> > > > it
> > > > to bitmask anyways.
> > > 
> > > Which is pretty much how everything else works in the driver.
> > 
> > Another piece of incredibly important thing to argue..
> 
> It is. Lot's of crap in the driver is simply caused by people not
> following established style and coming up with something different..
> Which means there is an extra cost for everyone else when reading the
> code. And that probably means most people just don't bother to read
> it.
> 
> Granted, some of that is due to spurts in evolution where major
> overhauls
> changed the architecture so significantly that the old established
> conventions simply made node sense anymore. Atomic being the last big
> one in that regard. And we're still cleaning up the resulting mess.
> 
> > 
> > > 
> > > > 
> > > > 
> > > > > 
> > > > > > +#define DBUF_CTL                     _MMIO(0x45008)
> > > > > > +#define DBUF_CTL_S1                  _MMIO(0x45008)
> > > > > > +#define DBUF_CTL_S2                  _MMIO(0x44FE8)
> > > > > >  #define  DBUF_POWER_REQUEST          (1 << 31)
> > > > > >  #define  DBUF_POWER_STATE            (1 << 30)
> > > > > >  #define GEN7_MSG_CTL _MMIO(0x45010)
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > index 4bdf8a6cfb47..ba34e1a5c591 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > +++ b/drivers/gpu/drm/i915/intel_device_info.h
> > > > > > @@ -180,6 +180,7 @@ struct intel_device_info {
> > > > > >       } display;
> > > > > > 
> > > > > >       u16 ddb_size; /* in blocks */
> > > > > > +     u8 supported_slices; /* number of DBuf slices */
> > > > > > 
> > > > > >       /* Register offsets for the various display pipes and
> > > > > > transcoders */
> > > > > >       int pipe_offsets[I915_MAX_TRANSCODERS];
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_pm.c
> > > > > > b/drivers/gpu/drm/i915/intel_pm.c
> > > > > > index 2d389e437e87..1ea81ab92429 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_pm.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > > > > > @@ -3586,24 +3586,20 @@ bool ilk_disable_lp_wm(struct
> > > > > > drm_device
> > > > > > *dev)
> > > > > >       return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
> > > > > >  }
> > > > > > 
> > > > > > -static u8 intel_enabled_dbuf_slices_num(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > > +static u8 intel_enabled_dbuf_slices(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv)
> > > > > >  {
> > > > > > -     u8 enabled_slices;
> > > > > > -
> > > > > > -     /* Slice 1 will always be enabled */
> > > > > > -     enabled_slices = 1;
> > > > > > +     u8 enabled_slices = 0;
> > > > > > 
> > > > > >       /* Gen prior to GEN11 have only one DBuf slice */
> > > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > > -             return enabled_slices;
> > > > > > +             return DBUF_S1_BIT;
> > > > > 
> > > > > I'd nuke this special case entirely and just read everything
> > > > > from
> > > > > the
> > > > > hw. Assuming we need readout somewhere.
> > > > 
> > > > We have it that way hardcoded right now in drm-tip. I think
> > > > this is
> > > > a
> > > > subject to other patch, but not this change. As I understand we
> > > > need
> > > > to first figure out how to do it properly.
> > > > This patch anyway makes it better by at least removing
> > > > hardcoded
> > > > relation between number of slices and which slices are enabled.
> > > > With current code there is no way for example to enable second
> > > > DBuf
> > > > slice only, which is required by BSpec.
> > > > 
> > > > > 
> > > > > > 
> > > > > > -     /*
> > > > > > -      * FIXME: for now we'll only ever use 1 slice;
> > > > > > pretend
> > > > > > that we
> > > > > > have
> > > > > > -      * only that 1 slice enabled until we have a proper
> > > > > > way
> > > > > > for on-
> > > > > > demand
> > > > > > -      * toggling of the second slice.
> > > > > > -      */
> > > > > > -     if (0 && I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > > -             enabled_slices++;
> > > > > > +     /* Check if second DBuf slice is enabled */
> > > > > > +     if (I915_READ(DBUF_CTL_S1) & DBUF_POWER_STATE)
> > > > > > +             enabled_slices |= DBUF_S1_BIT;
> > > > > > +
> > > > > > +     if (I915_READ(DBUF_CTL_S2) & DBUF_POWER_STATE)
> > > > > > +             enabled_slices |= DBUF_S2_BIT;
> > > > > > 
> > > > > >       return enabled_slices;
> > > > > >  }
> > > > > > @@ -3812,36 +3808,38 @@ static u16
> > > > > > intel_get_ddb_size(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >                             const int num_active,
> > > > > >                             struct skl_ddb_allocation *ddb)
> > > > > >  {
> > > > > > -     const struct drm_display_mode *adjusted_mode;
> > > > > > -     u64 total_data_bw;
> > > > > >       u16 ddb_size = INTEL_INFO(dev_priv)->ddb_size;
> > > > > > -
> > > > > >       WARN_ON(ddb_size == 0);
> > > > > > 
> > > > > >       if (INTEL_GEN(dev_priv) < 11)
> > > > > >               return ddb_size - 4; /* 4 blocks for bypass
> > > > > > path
> > > > > > allocation */
> > > > > > 
> > > > > > -     adjusted_mode = &crtc_state->hw.adjusted_mode;
> > > > > > -     total_data_bw = total_data_rate *
> > > > > > drm_mode_vrefresh(adjusted_mode);
> > > > > > +     return ddb_size;
> > > > > > +}
> > > > > > 
> > > > > > -     /*
> > > > > > -      * 12GB/s is maximum BW supported by single DBuf
> > > > > > slice.
> > > > > > -      *
> > > > > > -      * FIXME dbuf slice code is broken:
> > > > > > -      * - must wait for planes to stop using the slice
> > > > > > before
> > > > > > powering it off
> > > > > > -      * - plane straddling both slices is illegal in
> > > > > > multi-
> > > > > > pipe
> > > > > > scenarios
> > > > > > -      * - should validate we stay within the hw bandwidth
> > > > > > limits
> > > > > > -      */
> > > > > > -     if (0 && (num_active > 1 || total_data_bw >=
> > > > > > GBps(12)))
> > > > > > {
> > > > > > -             ddb->enabled_slices = 2;
> > > > > > -     } else {
> > > > > > -             ddb->enabled_slices = 1;
> > > > > > -             ddb_size /= 2;
> > > > > > -     }
> > > > > > +/*
> > > > > > + * Calculate initial DBuf slice offset, based on slice
> > > > > > size
> > > > > > + * and mask(i.e if slice size is 1024 and second slice is
> > > > > > enabled
> > > > > > + * offset would be 1024)
> > > > > > + */
> > > > > > +static u32 skl_get_first_dbuf_slice_offset(u32
> > > > > > dbuf_slice_mask,
> > > > > > +                                        u32 slice_size,
> > > > > > u32
> > > > > > ddb_size)
> > > > > > +{
> > > > > > +     u32 offset = 0;
> > > > > > 
> > > > > > -     return ddb_size;
> > > > > > +     if (!dbuf_slice_mask)
> > > > > > +             return 0;
> > > > > > +
> > > > > > +     offset = (ffs(dbuf_slice_mask) - 1) * slice_size;
> > > > > > +
> > > > > > +     WARN_ON(offset >= ddb_size);
> > > > > > +     return offset;
> > > > > >  }
> > > > > > 
> > > > > > +static u32 i915_get_allowed_dbuf_mask(struct
> > > > > > drm_i915_private
> > > > > > *dev_priv,
> > > > > > +                                   int pipe, u32
> > > > > > active_pipes,
> > > > > > +                                   const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state);
> > > > > > +
> > > > > >  static void
> > > > > >  skl_ddb_get_pipe_allocation_limits(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state,
> > > > > > @@ -3857,7 +3855,14 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >       u32 pipe_width = 0, total_width = 0,
> > > > > > width_before_pipe
> > > > > > = 0;
> > > > > >       enum pipe for_pipe = to_intel_crtc(for_crtc)->pipe;
> > > > > >       u16 ddb_size;
> > > > > > +     u32 ddb_range_size;
> > > > > >       u32 i;
> > > > > > +     u32 dbuf_slice_mask;
> > > > > > +     u32 active_pipes;
> > > > > > +     u32 offset;
> > > > > > +     u32 slice_size;
> > > > > > +     u32 total_slice_mask;
> > > > > > +     u32 start, end;
> > > > > > 
> > > > > >       if (WARN_ON(!state) || !crtc_state->hw.active) {
> > > > > >               alloc->start = 0;
> > > > > > @@ -3866,14 +3871,23 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > -     if (intel_state->active_pipe_changes)
> > > > > > +     if (intel_state->active_pipe_changes) {
> > > > > >               *num_active = hweight8(intel_state-
> > > > > > > active_pipes);
> > > > > > 
> > > > > > -     else
> > > > > > +             active_pipes = intel_state->active_pipes;
> > > > > > +     } else {
> > > > > >               *num_active = hweight8(dev_priv-
> > > > > > >active_pipes);
> > > > > > +             active_pipes = dev_priv->active_pipes;
> > > > > > +     }
> > > > > > 
> > > > > >       ddb_size = intel_get_ddb_size(dev_priv, crtc_state,
> > > > > > total_data_rate,
> > > > > >                                     *num_active, ddb);
> > > > > > 
> > > > > > +     DRM_DEBUG_KMS("Got total ddb size %d\n", ddb_size);
> > > > > > +
> > > > > > +     slice_size = ddb_size / INTEL_INFO(dev_priv)-
> > > > > > > supported_slices;
> > > > > > 
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Got DBuf slice size %d\n",
> > > > > > slice_size);
> > > > > > +
> > > > > >       /*
> > > > > >        * If the state doesn't change the active CRTC's or
> > > > > > there is no
> > > > > >        * modeset request, then there's no need to
> > > > > > recalculate;
> > > > > > @@ -3891,20 +3905,70 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >               return;
> > > > > >       }
> > > > > > 
> > > > > > +     /*
> > > > > > +      * Get allowed DBuf slices for correspondent pipe and
> > > > > > platform.
> > > > > > +      */
> > > > > > +     dbuf_slice_mask =
> > > > > > i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > for_pipe,
> > > > > > +                                                  active_p
> > > > > > ip
> > > > > > es,
> > > > > > crtc_state);
> > > > > 
> > > > > Quite a few redundant arguments. The function name looks
> > > > > alien.
> > > > > 
> > > > > skl_crtc_possible_dbuf_slices() or something perhaps?
> > > > > 
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("DBuf slice mask %x pipe %d active
> > > > > > pipes
> > > > > > %x\n",
> > > > > > +                   dbuf_slice_mask,
> > > > > > +                   for_pipe, active_pipes);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Figure out at which DBuf slice we start, i.e if we
> > > > > > start at
> > > > > > Dbuf S2
> > > > > > +      * and slice size is 1024, the offset would be 1024
> > > > > > +      */
> > > > > > +     offset =
> > > > > > skl_get_first_dbuf_slice_offset(dbuf_slice_mask,
> > > > > > +                                              slice_size,
> > > > > > ddb_size);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Figure out total size of allowed DBuf slices,
> > > > > > which
> > > > > > is
> > > > > > basically
> > > > > > +      * a number of allowed slices for that pipe
> > > > > > multiplied
> > > > > > by slice
> > > > > > size.
> > > > > > +      * Inside of this
> > > > > > +      * range ddb entries are still allocated in
> > > > > > proportion
> > > > > > to
> > > > > > display width.
> > > > > > +      */
> > > > > > +     ddb_range_size = hweight8(dbuf_slice_mask) *
> > > > > > slice_size;
> > > > > > +
> > > > > >       /*
> > > > > >        * Watermark/ddb requirement highly depends upon
> > > > > > width
> > > > > > of the
> > > > > >        * framebuffer, So instead of allocating DDB equally
> > > > > > among
> > > > > > pipes
> > > > > >        * distribute DDB based on resolution/width of the
> > > > > > display.
> > > > > >        */
> > > > > > +     total_slice_mask = dbuf_slice_mask;
> > > > > 
> > > > > Isn't this already taken care by the loop below?
> > > > > 
> > > > > >       for_each_new_intel_crtc_in_state(intel_state, crtc,
> > > > > > crtc_state,
> > > > > > i) {
> > > > > >               const struct drm_display_mode *adjusted_mode
> > > > > > =
> > > > > >                       &crtc_state->hw.adjusted_mode;
> > > > > >               enum pipe pipe = crtc->pipe;
> > > > > >               int hdisplay, vdisplay;
> > > > > > +             u32 pipe_dbuf_slice_mask = \
> > > > > > +                     i915_get_allowed_dbuf_mask(dev_priv,
> > > > > > +                                             pipe,
> > > > > > +                                             active_pipes,
> > > > > > +                                             crtc_state);
> > > > > > 
> > > > > >               if (!crtc_state->hw.enable)
> > > > > >                       continue;
> > > > > > 
> > > > > > +             /*
> > > > > > +              * According to BSpec pipe can share one dbuf
> > > > > > slice
> > > > > > with another
> > > > > > +              * pipes or pipe can use multiple dbufs, in
> > > > > > both cases
> > > > > > we
> > > > > > +              * account for other pipes only if they have
> > > > > > exactly
> > > > > > same mask.
> > > > > > +              * However we need to account how many slices
> > > > > > we should
> > > > > > enable
> > > > > > +              * in total.
> > > > > > +              */
> > > > > > +             total_slice_mask |= pipe_dbuf_slice_mask;
> > > > > 
> > > > > total_slice_mask will now account only the crtcs in the
> > > > > state.
> > > > > What
> > > > > happens to the other pipes' slices?
> > 
> > Same as happens now btw. Right now it also does the same thing.
> 
> Right now it does absolutely nothing. It's all dead code.
> 
> > 
> > > > 
> > > > Yes we need to iterate through all pipes/crtcs here, however
> > > > for getting correct dbuf slice mask most important is
> > > > active_pipes.
> > > > However yes, I will change the loop for it to be more correct.
> > > > 
> > > > > 
> > > > > > +
> > > > > > +             /*
> > > > > > +              * Do not account pipes using other slice
> > > > > > sets
> > > > > > +              * luckily as of current BSpec slice sets do
> > > > > > not
> > > > > > partially
> > > > > > +              * intersect(pipes share either same one
> > > > > > slice
> > > > > > or same
> > > > > > slice set
> > > > > > +              * i.e no partial intersection), so it is
> > > > > > enough to
> > > > > > check for
> > > > > > +              * equality for now.
> > > > > > +              */
> > > > > > +             if (dbuf_slice_mask != pipe_dbuf_slice_mask)
> > > > > > +                     continue;
> > > > > > +
> > > > > >               drm_mode_get_hv_timing(adjusted_mode,
> > > > > > &hdisplay,
> > > > > > &vdisplay);
> > > > > >               total_width += hdisplay;
> > > > > > 
> > > > > > @@ -3914,8 +3978,19 @@
> > > > > > skl_ddb_get_pipe_allocation_limits(struct
> > > > > > drm_i915_private *dev_priv,
> > > > > >                       pipe_width = hdisplay;
> > > > > >       }
> > > > > > 
> > > > > > -     alloc->start = ddb_size * width_before_pipe /
> > > > > > total_width;
> > > > > > -     alloc->end = ddb_size * (width_before_pipe +
> > > > > > pipe_width) /
> > > > > > total_width;
> > > > > > +     ddb->enabled_slices = total_slice_mask;
> > > > > > +
> > > > > > +     start = ddb_range_size * width_before_pipe /
> > > > > > total_width;
> > > > > > +     end = ddb_range_size * (width_before_pipe +
> > > > > > pipe_width)
> > > > > > /
> > > > > > total_width;
> > > > > > +
> > > > > > +     alloc->start = offset + start;
> > > > > > +     alloc->end = offset + end;
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ddb %d-%d\n", for_pipe,
> > > > > > +                   alloc->start, alloc->end);
> > > > > > +     DRM_DEBUG_KMS("Enabled ddb slices mask %x num
> > > > > > supported
> > > > > > %d\n",
> > > > > > +                   ddb->enabled_slices,
> > > > > > +                   INTEL_INFO(dev_priv)-
> > > > > > >supported_slices);
> > > > > >  }
> > > > > > 
> > > > > >  static int skl_compute_wm_params(const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state,
> > > > > > @@ -4036,7 +4111,8 @@ void skl_pipe_ddb_get_hw_state(struct
> > > > > > intel_crtc *crtc,
> > > > > >  void skl_ddb_get_hw_state(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > >                         struct skl_ddb_allocation *ddb /*
> > > > > > out
> > > > > > */)
> > > > > >  {
> > > > > > -     ddb->enabled_slices =
> > > > > > intel_enabled_dbuf_slices_num(dev_priv);
> > > > > > +     ddb->enabled_slices =
> > > > > > intel_enabled_dbuf_slices(dev_priv);
> > > > > > +     DRM_DEBUG_KMS("Got hw dbuf slices mask %x\n", ddb-
> > > > > > > enabled_slices);
> > > > > > 
> > > > > >  }
> > > > > > 
> > > > > >  /*
> > > > > > @@ -4164,6 +4240,241 @@
> > > > > > skl_get_total_relative_data_rate(struct
> > > > > > intel_crtc_state *crtc_state,
> > > > > >       return total_data_rate;
> > > > > >  }
> > > > > > 
> > > > > > +static uint_fixed_16_16_t
> > > > > > +icl_pipe_downscale_amount(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     u32 src_w, src_h, dst_w, dst_h;
> > > > > > +     uint_fixed_16_16_t fp_w_ratio, fp_h_ratio;
> > > > > > +     uint_fixed_16_16_t downscale_h, downscale_w;
> > > > > > +     const struct drm_display_mode *adjusted_mode =
> > > > > > &crtc_state-
> > > > > > > hw.adjusted_mode;
> > > > > > 
> > > > > > +
> > > > > > +     src_w = crtc_state->pipe_src_w;
> > > > > > +     src_h = crtc_state->pipe_src_h;
> > > > > > +     dst_w = adjusted_mode->crtc_hdisplay;
> > > > > > +     dst_h = adjusted_mode->crtc_vdisplay;
> > > > > > +
> > > > > > +     fp_w_ratio = div_fixed16(src_w, dst_w);
> > > > > > +     fp_h_ratio = div_fixed16(src_h, dst_h);
> > > > > > +     downscale_w = max_fixed16(fp_w_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +     downscale_h = max_fixed16(fp_h_ratio,
> > > > > > u32_to_fixed16(1));
> > > > > > +
> > > > > > +     return mul_fixed16(downscale_w, downscale_h);
> > > > > > +}
> > > > > > +
> > > > > > +uint_fixed_16_16_t
> > > > > > +icl_get_pipe_ratio(const struct intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     struct drm_plane *plane;
> > > > > > +     const struct drm_plane_state *drm_plane_state;
> > > > > > +     uint_fixed_16_16_t pipe_downscale, pipe_ratio;
> > > > > > +     uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
> > > > > > +     struct intel_crtc *crtc = to_intel_crtc(crtc_state-
> > > > > > > uapi.crtc);
> > > > > > 
> > > > > > +
> > > > > > +     if (!crtc_state->uapi.enable)
> > > > > > +             return max_downscale;
> > > > > > +
> > > > > > +     drm_atomic_crtc_state_for_each_plane_state(plane,
> > > > > > drm_plane_state, &crtc_state->uapi) {
> > > > > > +             uint_fixed_16_16_t plane_downscale;
> > > > > > +             const struct intel_plane_state *plane_state =
> > > > > > +                     to_intel_plane_state(drm_plane_state)
> > > > > > ;
> > > > > > +
> > > > > > +             if (!intel_wm_plane_visible(crtc_state,
> > > > > > plane_state))
> > > > > > +                     continue;
> > > > > > +
> > > > > > +             plane_downscale =
> > > > > > skl_plane_downscale_amount(crtc_state, plane_state);
> > > > > > +
> > > > > > +             DRM_DEBUG_KMS("Plane %d downscale amount
> > > > > > %d.%d\n",
> > > > > > +                           to_intel_plane(plane)->id,
> > > > > > +                           plane_downscale.val >> 16,
> > > > > > +                           plane_downscale.val & 0xffff);
> > > > > > +
> > > > > > +             max_downscale = max_fixed16(plane_downscale,
> > > > > > max_downscale);
> > > > > > +     }
> > > > > > +
> > > > > > +
> > > > > > +     pipe_downscale =
> > > > > > icl_pipe_downscale_amount(crtc_state);
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d downscale amount %d.%d\n",
> > > > > > +                    crtc->pipe, pipe_downscale.val >> 16,
> > > > > > +                    pipe_downscale.val & 0xffff);
> > > > > > +
> > > > > > +     pipe_downscale = mul_fixed16(pipe_downscale,
> > > > > > max_downscale);
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Convert result to percentage: get the actual ratio
> > > > > > instead
> > > > > > of rate,
> > > > > > +      * then multiply by 100 to get percentage.
> > > > > > +      */
> > > > > > +     pipe_ratio = u32_to_fixed16(100 *
> > > > > > div_round_up_u32_fixed16(1,
> > > > > > pipe_downscale));
> > > > > > +
> > > > > > +     DRM_DEBUG_KMS("Pipe %d ratio %d.%d\n", crtc->pipe,
> > > > > > +                   pipe_ratio.val >> 16, pipe_ratio.val &
> > > > > > 0xffff);
> > > > > > +
> > > > > > +     return pipe_ratio;
> > > > > > +}
> > > > > 
> > > > > Yuck. I just killed these.
> > > > 
> > > > I still need pipe_ratio fro ICL as you know, this is dictated
> > > > by
> > > > BSpec,
> > > > not by me.
> > > 
> > > That part of the spec is pretty crazy IMO. I think we should just
> > > ignore
> > > it for now. Also nothing in there says we must use both slices
> > > for a
> > > single pipe so ignoring doesn't even violate the spec in any way.
> > 
> > So you are instructing me now to ignore BSpec as "we are smarter"?
> > I don't think this is properly to do it this way. Lets first change
> > the
> > BSpec, if we don't like pipe ratio here and then driver. Currently
> > this
> > is written that way and if we are going to do everything our own
> > way
> > ignoring specs, this is going to be chaos.
> 
> Like I sai you don't have to violate the spec at all. Just don't use
> two slices for the same pipe and you can avoid the whole strange pipe
> ratio limitation. We should file a bspec issue to get that clarified
> though.
> 
> > 
> > > 
> > > > 
> > > > > 
> > > > > > +
> > > > > > +struct dbuf_slice_conf_entry {
> > > > > > +     u32 dbuf_mask[I915_MAX_PIPES];
> > > > > > +     u32 active_pipes;
> > > > > > +};
> > > > > > +
> > > > > > +
> > > > > > +#define ICL_PIPE_A_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { DBuf1, 0, 0, 0 }, BIT(PIPE_A) }
> > > > > > +#define ICL_PIPE_B_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, DBuf1, 0, 0 }, BIT(PIPE_B) }
> > > > > > +#define ICL_PIPE_C_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, 0, DBuf1, 0 }, BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_D_DBUF_SLICES(DBuf1)  \
> > > > > > +     { { 0, 0, 0, DBuf1 }, BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_AB_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, DBuf2, 0, 0 }, BIT(PIPE_A) | BIT(PIPE_B) }
> > > > > > +#define ICL_PIPE_BC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, DBuf1, DBuf2, 0 }, BIT(PIPE_B) | BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_BD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, DBuf1, 0, DBuf2 }, BIT(PIPE_B) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_AC_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, 0, DBuf2, 0 }, BIT(PIPE_A) | BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_AD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { DBuf1, 0, 0, DBuf2 }, BIT(PIPE_A) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_CD_DBUF_SLICES(DBuf1, DBuf2)   \
> > > > > > +     { { 0, 0, DBuf1, DBuf2 }, BIT(PIPE_C) | BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_ACD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, 0, DBuf2, DBuf3 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_C)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_BCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { 0, DBuf1, DBuf2, DBuf3 }, BIT(PIPE_B) |
> > > > > > BIT(PIPE_C)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABD_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, 0, DBuf3 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_D) }
> > > > > > +#define ICL_PIPE_ABC_DBUF_SLICES(DBuf1, DBuf2, DBuf3)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, 0 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B)
> > > > > > > 
> > > > > > 
> > > > > > BIT(PIPE_C) }
> > > > > > +#define ICL_PIPE_ABCD_DBUF_SLICES(DBuf1, DBuf2, DBuf3,
> > > > > > DBuf4)  \
> > > > > > +     { { DBuf1, DBuf2, DBuf3, DBuf4 }, BIT(PIPE_A) |
> > > > > > BIT(PIPE_B) | \
> > > > > > +                                       BIT(PIPE_C) |
> > > > > > BIT(PIPE_D) }
> > > > > > +
> > > > > > +/*
> > > > > > + * Table taken from Bspec 12716
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry icl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),  /*
> > > > > > pipe
> > > > > > ratio
> > > > > > < 88.8 */
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S2_BIT),                /* for
> > > > > > pipe
> > > > > > ratio >= 88.8 */
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > > +
> > > > > > +/*
> > > > > > + * Table taken from Bspec 49255
> > > > > > + * Pipes do have some preferred DBuf slice affinity,
> > > > > > + * plus there are some hardcoded requirements on how
> > > > > > + * those should be distributed for multipipe scenarios.
> > > > > > + * For more DBuf slices algorithm can get even more messy
> > > > > > + * and less readable, so decided to use a table almost
> > > > > > + * as is from BSpec itself - that way it is at least
> > > > > > easier
> > > > > > + * to compare, change and check.
> > > > > > + */
> > > > > > +struct dbuf_slice_conf_entry tgl_allowed_dbufs[] = {
> > > > > > +{ { 0, 0, 0, 0 }, 0 },
> > > > > > +ICL_PIPE_A_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_B_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_C_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_D_DBUF_SLICES(DBUF_S1_BIT | DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_AD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_CD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ACD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_BCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +ICL_PIPE_ABCD_DBUF_SLICES(DBUF_S1_BIT, DBUF_S1_BIT,
> > > > > > DBUF_S2_BIT,
> > > > > > DBUF_S2_BIT),
> > > > > > +};
> > > > > 
> > > > > My eyes!
> > > > > 
> > > > > I have to think we should be able to reduce all that to a
> > > > > handful
> > > > > of lines of code.
> > > > 
> > > > Yeah, then we are going to have a huge function with lots of
> > > > weird
> > > > definitions, unfortunately BSpec table has quite strange DBuf
> > > > assignments like
> > > > +ICL_PIPE_AB_DBUF_SLICES(DBUF_S2_BIT, DBUF_S1_BIT),
> > > > +ICL_PIPE_AC_DBUF_SLICES(DBUF_S1_BIT, DBUF_S2_BIT),
> > > > 
> > > > i.e slices can get mixed in a quite various ways. There is no
> > > > rational pattern in those and could be even dangerous to try to
> > > > optimize it some way, as one day it might change again in some
> > > > unpredictable way.
> > > > 
> > > > Would you prefer to have it like
> > > > 
> > > > if (pipes are A and B)
> > > >     program S2 to A and S1 to B
> > > > if (pipes are A and C)
> > > >     program S1 to A and S2 to C
> > > > ...
> > > > 
> > > > I would prefer at least to see it in a comparable way with the
> > > > table we have in BSpec, rather than having lots of weird
> > > > looking
> > > > if-else statements, GEN checks and so on.
> > > > 
> > > > I knew this table was not going to look like it is done
> > > > typically
> > > > here - but really my opinion that it is way better than
> > > > hardcoding
> > > > it into some weird algorithm, which would be hard to compare to
> > > > initial table, which is already strange enough.
> > > > 
> > > > If you don't like it - could you please explain why this is
> > > > exactly
> > > > worse than having long functions with hardcoded checks?
> > > 
> > > I think we should be able to encode this simply using the
> > > "Pipe and DBUF ordering" stuff from the spec. Ie. just specify
> > > what
> > > is the distance of each pipe from each dbuf slice. Then all we
> > > have
> > > to do is minimize the total distance. The following tables I
> > > think
> > > should encode that reasonably concisely:
> > > 
> > > /* pipeA - DBUF_S1 - pipeB - pipeC - DBUF_S2 */
> > > static const u8 icl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > > };
> > > 
> > > /* pipeD - DBUF_S2 - pipeC - pipeA - DBUF_S1 - pipeB */
> > > static const u8 tgl_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S1] = 1, [DBUF_S2] = 2, },
> > >     [PIPE_B] = { [DBUF_S1] = 1, [DBUF_S2] = 4, },
> > >     [PIPE_C] = { [DBUF_S1] = 2, [DBUF_S2] = 1, },
> > >     [PIPE_D] = { [DBUF_S1] = 4, [DBUF_S2] = 1, },
> > > };
> > 
> > I think you are missing my point. I will explain why.
> > My initial idea was similar, use some kind of distance
> > metrics and encode everything in a smart way.
> > Current BSpec table has sometimes slices swapped for no reason.
> > 
> > There are seem to be some hardware constraints we are not aware
> > of. For example for TGL pipes A,B get S2 and S1 slices,
> 
> A/S1 + B/S2 = 1 + 4 = 5
> A/S2 + B/S1 = 2 + 1 = 3
> 
> > but pipes A C get S1 and S2.
> 
> A/S1 + C/S2 = 1 + 1 = 2
> A/S2 + C/S1 = 2 + 2 = 4
> 
> So it's exactly as my distance table says it should be.
> 
> > 
> > if you encode all this into some kind of distance algorithm,
> > those differences would be enourmously difficult to spot and
> > compare with actual table. That is why I decided to do it that
> > way.
> > 
> > Most likely yes you can solve it by using this metrics,
> > but will it really be then more readable and comparable
> > with reference table?
> > What if some contraints will be added again, like pipe ratio?
> > 
> > > 
> > > /*
> > >  * pipeA - DBUF_S0 - DBUF_S1 - pipeB
> > >  * pipeC - DBUF_S2 - DBUF_S3 - pipeD
> > >  */
> > > static const u8 future_dbuf_distance[][] {
> > >     [PIPE_A] = { [DBUF_S0] = 1, [DBUF_S1] = 3, },
> > >     [PIPE_B] = { [DBUF_S0] = 3, [DBUF_S1] = 1, },
> > >     [PIPE_C] = { [DBUF_S2] = 1, [DBUF_S3] = 3, },
> > >     [PIPE_D] = { [DBUF_S2] = 3, [DBUF_S3] = 1, },
> > > };
> > > 
> > > I suppose we could make that even more minimal by just
> > > directly sticking that "Pipe and DBUF ordering" language
> > > into a single array but and then calculate each distance
> > > from it, but that is perhaps more work, and we'd need
> > > some kind of enum just for that so that we have a single
> > > namespace for the pipes and dbuf slices. So i have a
> > > feeling having those distances pre-computed in a table
> > > like thet is maybe a better way to go.
> > 
> > Good luck then spoting differences with the BSpec table once it
> > gets changed.
> 
> Well, I'd rather do that than decode through layers of cpp macros.
> Also atm the table has no extra restrictions so it's clearly there
> just because someone thought they'd write out the results of the
> distance calculations. In general bspec style is somewhat annoying
> where they try to write out things in the lowest level of absraction
> possible (they are hardware people after all). I'd much prefer a
> higher level description of the issue.
> 
> > 
> > > 
> > > > 
> > > > I think this is pretty stupid to fix stuff which looks new or
> > > > unusual
> > > > just because somebody doesn't like it, without even
> > > > constructive
> > > > arguing.
> > > > 
> > > > - Stan
> > > > 
> > > > > 
> > > > > 
> > > > > > +
> > > > > > +/*
> > > > > > + * This function
> > > > > 
> > > > > "This function" is totally redundant. Ditto elsewhere.
> > > > > 
> > > > > > finds an entry with same enabled pipe configuration and
> > > > > > + * returns correspondent DBuf slice mask as stated in
> > > > > > BSpec
> > > > > > for
> > > > > > particular
> > > > > > + * platform.
> > > > > > + */
> > > > > > +static u32 icl_get_allowed_dbuf_mask(int pipe,
> > > > > > +                                  u32 active_pipes,
> > > > > > +                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     int i;
> > > > > > +     uint_fixed_16_16_t pipe_ratio;
> > > > > > +
> > > > > > +     /*
> > > > > > +      * Calculate pipe ratio as stated in BSpec 28692
> > > > > > +      */
> > > > > > +     pipe_ratio = icl_get_pipe_ratio(crtc_state);
> > > > > > +
> > > > > > +     for (i = 0; i < ARRAY_SIZE(icl_allowed_dbufs); i++) {
> > > > > > +             if (icl_allowed_dbufs[i].active_pipes ==
> > > > > > active_pipes)
> > > > > > {
> > > > > > +                     if (hweight32(active_pipes) == 1) {
> > > > > > +                             /*
> > > > > > +                              * According to BSpec 12716:
> > > > > > if
> > > > > > ratio
> > > > > > > = 88.8
> > > > > > 
> > > > > > +                              * always use single dbuf
> > > > > > slice.
> > > > > > +                              */
> > > > > > +                             if (pipe_ratio.val >=
> > > > > > div_fixed16(888,
> > > > > > 10).val)
> > > > > > +                                     ++i;
> > > > > > +                     }
> > > > > > +                     return
> > > > > > icl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > > +             }
> > > > > > +     }
> > > > > > +     return 0;
> > > > > > +}
> > > > > > +
> > > > > > +/*
> > > > > > + * This function finds an entry with same enabled pipe
> > > > > > configuration and
> > > > > > + * returns correspondent DBuf slice mask as stated in
> > > > > > BSpec
> > > > > > for
> > > > > > particular
> > > > > > + * platform.
> > > > > > + */
> > > > > > +static u32 tgl_get_allowed_dbuf_mask(int pipe,
> > > > > > +                                  u32 active_pipes,
> > > > > > +                                  const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     int i;
> > > > > > +
> > > > > > +     for (i = 0; i < ARRAY_SIZE(tgl_allowed_dbufs); i++) {
> > > > > > +             if (tgl_allowed_dbufs[i].active_pipes ==
> > > > > > active_pipes)
> > > > > > +                     return
> > > > > > tgl_allowed_dbufs[i].dbuf_mask[pipe];
> > > > > > +     }
> > > > > > +     return 0;
> > > > > > +}
> > > > > > +
> > > > > > +u32 i915_get_allowed_dbuf_mask(struct drm_i915_private
> > > > > > *dev_priv,
> > > > > > +                                   int pipe, u32
> > > > > > active_pipes,
> > > > > > +                                   const struct
> > > > > > intel_crtc_state
> > > > > > *crtc_state)
> > > > > > +{
> > > > > > +     if (IS_GEN(dev_priv, 11))
> > > > > > +             return icl_get_allowed_dbuf_mask(pipe,
> > > > > > +                                              active_pipes
> > > > > > ,
> > > > > > +                                              crtc_state);
> > > > > > +     else if (IS_GEN(dev_priv, 12))
> > > > > > +             return tgl_get_allowed_dbuf_mask(pipe,
> > > > > > +                                              active_pipes
> > > > > > ,
> > > > > > +                                              crtc_state);
> > > > > > +     /*
> > > > > > +      * For anything else just return one slice yet.
> > > > > > +      * Should be extended for other platforms.
> > > > > > +      */
> > > > > > +     return DBUF_S1_BIT;
> > > > > > +}
> > > > > > +
> > > > > >  static u64
> > > > > >  icl_get_total_relative_data_rate(struct intel_crtc_state
> > > > > > *crtc_state,
> > > > > >                                u64 *plane_data_rate)
> > > > > > --
> > > > > > 2.17.1
> > > > > 
> > > > > 
> > > 
> > > 
> 
> --
> Ville Syrjälä
> Intel
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx





[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux