On Tue, Feb 15, 2022 at 06:52:51PM +0200, Ville Syrjälä wrote: > On Tue, Feb 15, 2022 at 06:33:42PM +0200, Lisovskiy, Stanislav wrote: > > On Tue, Feb 15, 2022 at 01:26:50PM +0200, Ville Syrjälä wrote: > > > On Tue, Feb 15, 2022 at 01:02:48PM +0200, Lisovskiy, Stanislav wrote: > > > > On Tue, Feb 15, 2022 at 12:10:19PM +0200, Ville Syrjälä wrote: > > > > > On Tue, Feb 15, 2022 at 10:59:57AM +0200, Lisovskiy, Stanislav wrote: > > > > > > On Mon, Feb 14, 2022 at 10:26:39PM +0200, Ville Syrjälä wrote: > > > > > > > On Mon, Feb 14, 2022 at 07:03:05PM +0200, Lisovskiy, Stanislav wrote: > > > > > > > > On Mon, Feb 14, 2022 at 12:24:57PM +0200, Ville Syrjälä wrote: > > > > > > > > > On Mon, Feb 14, 2022 at 12:05:36PM +0200, Lisovskiy, Stanislav wrote: > > > > > > > > > > On Mon, Feb 14, 2022 at 11:18:07AM +0200, Ville Syrjala wrote: > > > > > > > > > > > From: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > > > > > > > > > > > > > > > > > > > If the only thing that is changing is SAGV vs. no SAGV but > > > > > > > > > > > the number of active planes and the total data rates end up > > > > > > > > > > > unchanged we currently bail out of intel_bw_atomic_check() > > > > > > > > > > > early and forget to actually compute the new WGV point > > > > > > > > > > > mask and thus won't actually enable/disable SAGV as requested. > > > > > > > > > > > This ends up poorly if we end up running with SAGV enabled > > > > > > > > > > > when we shouldn't. Usually ends up in underruns. > > > > > > > > > > > To fix this let's go through the QGV point mask computation > > > > > > > > > > > if anyone else already added the bw state for us. > > > > > > > > > > > > > > > > > > > > Haven't been looking this in a while. Despite we have been > > > > > > > > > > looking like few revisions together still some bugs :( > > > > > > > > > > > > > > > > > > > > I thought SAGV vs No SAGV can't change if active planes > > > > > > > > > > or data rate didn't change? Because it means we probably > > > > > > > > > > still have same ddb allocations, which means SAGV state > > > > > > > > > > will just stay the same. > > > > > > > > > > > > > > > > > > SAGV can change due to watermarks/ddb allocations. The easiest > > > > > > > > > way to trip this up is to try to use the async flip wm0/ddb > > > > > > > > > optimization. That immediately forgets to turn off SAGV and > > > > > > > > > we get underruns, whcih is how I noticed this. And I don't > > > > > > > > > immediately see any easy proof that this couldn't also happen > > > > > > > > > due to some other plane changes. > > > > > > > > > > > > > > > > Thats the way it was initially implemented even before SAGV was added. > > > > > > > > > > > > > > Yeah, it wasn't a problem as long as SAGV was not enabled. > > > > > > > > > > > > > > > I think it can be dated back to the very first bw check was implemented. > > > > > > > > > > > > > > > > commit c457d9cf256e942138a54a2e80349ee7fe20c391 > > > > > > > > Author: Ville Syrjälä <ville.syrjala@xxxxxxxxxxxxxxx> > > > > > > > > Date: Fri May 24 18:36:14 2019 +0300 > > > > > > > > > > > > > > > > drm/i915: Make sure we have enough memory bandwidth on ICL > > > > > > > > > > > > > > > > +int intel_bw_atomic_check(struct intel_atomic_state *state) > > > > > > > > +{ > > > > > > > > + struct drm_i915_private *dev_priv = to_i915(state->base.dev); > > > > > > > > + struct intel_crtc_state *new_crtc_state, *old_crtc_state; > > > > > > > > + struct intel_bw_state *bw_state = NULL; > > > > > > > > + unsigned int data_rate, max_data_rate; > > > > > > > > + unsigned int num_active_planes; > > > > > > > > + struct intel_crtc *crtc; > > > > > > > > + int i; > > > > > > > > + > > > > > > > > + /* FIXME earlier gens need some checks too */ > > > > > > > > + if (INTEL_GEN(dev_priv) < 11) > > > > > > > > + return 0; > > > > > > > > + > > > > > > > > + for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state, > > > > > > > > + new_crtc_state, i) { > > > > > > > > + unsigned int old_data_rate = > > > > > > > > + intel_bw_crtc_data_rate(old_crtc_state); > > > > > > > > + unsigned int new_data_rate = > > > > > > > > + intel_bw_crtc_data_rate(new_crtc_state); > > > > > > > > + unsigned int old_active_planes = > > > > > > > > + intel_bw_crtc_num_active_planes(old_crtc_state); > > > > > > > > + unsigned int new_active_planes = > > > > > > > > + intel_bw_crtc_num_active_planes(new_crtc_state); > > > > > > > > + > > > > > > > > + /* > > > > > > > > + * Avoid locking the bw state when > > > > > > > > + * nothing significant has changed. > > > > > > > > + */ > > > > > > > > + if (old_data_rate == new_data_rate && > > > > > > > > + old_active_planes == new_active_planes) > > > > > > > > + continue; > > > > > > > > + > > > > > > > > + bw_state = intel_atomic_get_bw_state(state); > > > > > > > > + if (IS_ERR(bw_state)) > > > > > > > > + return PTR_ERR(bw_state); > > > > > > > > > > > > > > > > However, what can cause watermarks/ddb to change, besides plane state change > > > > > > > > and/or active planes change? We change watermarks, when we change ddb allocations > > > > > > > > and we change ddb allocations when active planes had changed and/or data rate > > > > > > > > had changed. > > > > > > > > > > > > > > The bw code only cares about the aggregate numbers from all the planes. > > > > > > > The planes could still change in some funny way where eg. some plane > > > > > > > frees up some bandwidth, but the other planes gobble up the exact same > > > > > > > amount and thus the aggregate numbers the bw atomic check cares about > > > > > > > do not change but the watermarks/ddb do. > > > > > > > > > > > > > > And as mentiioned, the async flip wm0/ddb optimization makes this trivial > > > > > > > to trip up since it will want to disable SAGV as there is not enough ddb > > > > > > > for the SAGV watermark. And async flip specifically isn't even allowed > > > > > > > to change anything that would affect the bandwidth utilization, and neither > > > > > > > is it allowed to enable/disable planes. > > > > > > > > > > > > I think the whole idea of setting ddb to minimum in case of async flip optimization > > > > > > was purely our idea - BSpec/HSD only mentions forbidding wm levels > 0 in case of async > > > > > > flip, however there is nothing about limiting ddb allocations. > > > > > > > > > > Reducing just the watermark doesn't really make sense > > > > > if the goal is to keep the DBUF level to a minimum. Also > > > > > I don't think there is any proper docs for this thing. The > > > > > only thing we have just has some vague notes about using > > > > > "minimum watermarks", whatever that means. > > > > > > > > Was it the goal? I thought limiting watermarks would by itself also > > > > limit package C states, thus affecting memory clocks and latency. > > > > Because it really doesn't say anything about keeping Dbuf allocations > > > > to a minimum. > > > > > > The goal is to miminize the amount of data in the FIFO. > > > > > > > > > > > > > > > > > > > > > > > > Was a bit suspicious about that whole change, to be honest - and yep, now it seems to > > > > > > cause some unexpected side effects. > > > > > > > > > > The bw_state vs. SAGV bug is there regardless of the wm0 optimization. > > > > > > > > I agree there is a bug. The bug is such that initial bw checks were relying > > > > on total data rate + active planes comparison, while it should have accounted > > > > data rate per plane usage. > > > > > > > > This should have been changed in SAGV patches, but probably had gone > > > > unnoticed both by you and me. > > > > > > > > > > > > > > Also the SAGV watermark is not the minimum watermark (if that is > > > > > the doc really means by that), the normal WM0 is the minimum watermark. > > > > > So even if we interpret the doc to say that we should just disable all > > > > > watermark levels except the smallest one (normal WM0) without changing > > > > > the ddb allocations we would still end up disabling SAGV. > > > > > > > > Thats actually a good question. Did they mean, disable all "regular" wm levels > > > > or the SAGV one also? Probably they meant what you say, but would be nice to know > > > > exactly. > > > > > > They said neither. It's just "program minimum watermarks" which > > > could mean anything really. They do explicitly say "DBUF level > > > can also adversely affect flip performance." which I think is > > > the whole point of this exercise. > > > > > > > > > > > Anyway my point here is that, we probably shouldn't use new_bw_state as a way to > > > > check that plane allocations had changed. Thats just confusing. > > > > > > We are not checking if plane allocations have changed. We are > > > trying to determine if anything in the bw_state has changed. > > > If we have said state already then something in it may have > > > changed and we have to recalculate anything that may depend > > > on those changed things, namely pipe_sagv_reject->qgv_point_mask. > > > > I think it is just not very intuitive that we use the fact whether > > we can get new_bw_state or not, as a way to check if something had > > changed. > > Would be nice to put it in somekind of a wrapper like "has_new_bw_state" > > or "bw_state_changed". Because for anyone not quite familiar with > > that state paradigm we use, that would look pretty confusing that first > > we get new_bw_state using intel_atomic_get_new_bw_state, then immediately > > override it with intel_atomic_get_bw_state. > > And whether we can get new_bw_state or not is just acting like a check, > > that we don't have anything changed in bw_state. > > I think the only thing we'd achieve is something like this: > > new_bw_state = NULL; > if (has_new_bw_state()) > new_bw_state = get_new_bw_state(); > ... > if (!new_bw_state) > return 0; > > instead of just > > new_bw_state = get_new_bw_state(); > ... > if (!new_bw_state) > return 0; > > I don't know why that would be an improvement. Though I suppose a comment might be in order pointing the reader towards intel_compute_sagv_mask(). -- Ville Syrjälä Intel