Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> writes:

> On 28/01/2022 22:10, Michael Cheng wrote:
>> Re-work invalidate_csb_entries to use drm_clflush_virt_range. This will
>> prevent compiler errors when building for non-x86 architectures.
>>
>> Signed-off-by: Michael Cheng <michael.cheng@xxxxxxxxx>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> index 960a9aaf4f3a..90b5daf9433d 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
>> @@ -1647,8 +1647,8 @@ cancel_port_requests(struct intel_engine_execlists * const execlists,
>>
>>   static void invalidate_csb_entries(const u64 *first, const u64 *last)
>>   {
>> -	clflush((void *)first);
>> -	clflush((void *)last);
>> +	drm_clflush_virt_range((void *)first, sizeof(*first));
>> +	drm_clflush_virt_range((void *)last, sizeof(*last));
>
> How about dropping the helper and from the single call site do:
>
>   drm_clflush_virt_range(&buf[0], num_entries * sizeof(buf[0]));
>
> One less function call, and the CSB is a single cacheline before Gen11
> anyway, two afterwards, so overall a better conversion I think.

How does that sound? It would definitely work.

Now trying to remember why this went with explicit clflushes in the
first place: iirc, since this is GPU/CPU coherency, the
wbinvd_on_all_cpus fallback we get with *_virt_range would just be an
unnecessary perf hit.

-Mika

>
> Regards,
>
> Tvrtko
>
>>   }
>>
>>   /*