Quoting Mika Kuoppala (2020-08-14 19:41:14)
> Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> writes:
>
> > Since we expect to inline the csb_parse() routines, the w/a for the
> > stale CSB data on Tigerlake will be pulled into process_csb(), and so we
> > might as well simply reuse the logic for all, and so will hopefully
> > avoid any strange behaviour on Icelake that was not covered by our
> > previous w/a.
> >
> > References: d8f505311717 ("drm/i915/icl: Forcibly evict stale csb entries")
> > Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> > Cc: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> > Cc: Bruce Chang <yu.bruce.chang@xxxxxxxxx>
> > ---
> >  drivers/gpu/drm/i915/gt/intel_lrc.c | 70 +++++++++++++++++------------
> >  1 file changed, 42 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > index 3b8161c6b601..c176a029f27b 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_lrc.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
> > @@ -2496,25 +2496,11 @@ invalidate_csb_entries(const u64 *first, const u64 *last)
> >   * bits 47-57: sw context id of the lrc the GT switched away from
> >   * bits 58-63: sw counter of the lrc the GT switched away from
> >   */
> > -static inline bool gen12_csb_parse(const u64 *csb)
> > +static inline bool gen12_csb_parse(const u64 csb)
> >  {
> > -	bool ctx_away_valid;
> > -	bool new_queue;
> > -	u64 entry;
> > -
> > -	/* XXX HSD */
> > -	entry = READ_ONCE(*csb);
> > -	if (unlikely(entry == -1)) {
> > -		preempt_disable();
> > -		if (wait_for_atomic_us((entry = READ_ONCE(*csb)) != -1, 50))
>
> If we get this deep into desperation, should we start to apply more
> pressure? I.e., rmb instead of just instructing the compiler. And could also
> start to invalidate the entry, which obviously is of no use.

I had a rmb() here; removing it did not appear to make any difference
whatsoever to the average delay. The extreme case would be a full
mb(); clflush(); mb() read.
I haven't timed the average for that....

> It could even be that the invalidate pays out as the correct value
> bubbles through the hierarchy faster?

I had the same thought... But atm my feeling is the issue is not on the
CPU side (or at least not controllable from our code on the CPU).
-Chris
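
For anyone following along, the bounded re-read being discussed can be
sketched in user space roughly as below. This is only an illustration
under stated assumptions, not the i915 code: csb_read_stable(),
CSB_INVALID and the fence_each_read flag are made-up names, the relaxed
C11 load stands in for READ_ONCE(), and the acquire fence is only a loose
stand-in for rmb(). It counts spins instead of timing out after 50us and
does not model clflush() at all.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the all-ones (-1) "not yet written" marker. */
#define CSB_INVALID UINT64_MAX

/*
 * Hypothetical helper: re-read *slot until it no longer holds the invalid
 * marker, giving up after max_spins extra attempts.
 *
 * fence_each_read == false: plain relaxed loads only (compiler-level, in
 *                           the spirit of READ_ONCE()).
 * fence_each_read == true:  issue an acquire fence before each re-read
 *                           (a loose analogue of adding rmb()).
 *
 * Returns true and stores the entry in *out on success, false on give-up.
 */
static bool csb_read_stable(_Atomic uint64_t *slot, unsigned int max_spins,
			    bool fence_each_read, uint64_t *out)
{
	for (unsigned int i = 0; i <= max_spins; i++) {
		uint64_t entry;

		if (fence_each_read && i > 0)
			atomic_thread_fence(memory_order_acquire);

		entry = atomic_load_explicit(slot, memory_order_relaxed);
		if (entry != CSB_INVALID) {
			*out = entry;
			return true;
		}
	}

	return false; /* still stale after max_spins re-reads */
}
```

Calling it with fence_each_read set and then cleared on the same slot is
the user-space equivalent of the with/without rmb() comparison above; the
return value only says whether a valid entry showed up within the spin
budget, not why it was late.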