Quoting Lionel Landwerlin (2019-06-27 15:27:40) > On 27/06/2019 12:53, Chris Wilson wrote: > > Quoting Lionel Landwerlin (2019-06-27 09:00:43) > >> We would like to make use of perf in Vulkan. The Vulkan API is much > >> lower level than OpenGL, with applications directly exposed to the > >> concept of command buffers (pretty much equivalent to our batch > >> buffers). In Vulkan, queries are always limited in scope to a command > >> buffer. In OpenGL, the lack of command buffer concept meant that > >> queries' duration could span multiple command buffers. > >> > >> With that restriction gone in Vulkan, we would like to simplify > >> measuring performance just by measuring the deltas between the counter > >> snapshots written by 2 MI_RECORD_PERF_COUNT commands, rather than the > >> more complex scheme we currently have in the GL driver, using 2 > >> MI_RECORD_PERF_COUNT commands and doing some post processing on the > >> stream of OA reports, coming from the global OA buffer, to remove any > >> unrelated deltas in between the 2 MI_RECORD_PERF_COUNT. > >> > >> Disabling preemption only apply to a single context with which want to > >> query performance counters for and is considered a privileged > >> operation, by default protected by CAP_SYS_ADMIN. It is possible to > >> enable it for a normal user by disabling the paranoid stream setting. > >> > >> v2: Store preemption setting in intel_context (Chris) > >> > >> v3: Use priorities to avoid preemption rather than the HW mechanism > >> > >> v4: Just modify the port priority reporting function > >> > >> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx> > >> --- > >> drivers/gpu/drm/i915/gt/intel_lrc.c | 7 ++++++- > >> drivers/gpu/drm/i915/i915_drv.c | 2 +- > >> drivers/gpu/drm/i915/i915_drv.h | 13 ++++++++++++ > >> drivers/gpu/drm/i915/i915_perf.c | 22 +++++++++++++++++++-- > >> drivers/gpu/drm/i915/i915_priolist_types.h | 7 +++++++ > >> drivers/gpu/drm/i915/i915_request.c | 1 + > >> drivers/gpu/drm/i915/i915_request.h | 1 + > >> drivers/gpu/drm/i915/intel_guc_submission.c | 12 ++++++++++- > >> include/uapi/drm/i915_drm.h | 11 +++++++++++ > >> 9 files changed, 71 insertions(+), 5 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c > >> index a6ea468986d1..52f4d69cdb2f 100644 > >> --- a/drivers/gpu/drm/i915/gt/intel_lrc.c > >> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.c > >> @@ -256,7 +256,12 @@ static inline int rq_prio(const struct i915_request *rq) > >> > >> static int effective_prio(const struct i915_request *rq) > >> { > >> - int prio = rq_prio(rq); > >> + int prio; > >> + > >> + if (rq->has_perf) > >> + prio = I915_USER_PRIORITY(I915_PRIORITY_PERF); > >> + else > >> + prio = rq_prio(rq); > > This doesn't cover everything. You will get timesliced now. > > > > But I do think this is the safest approach that evades all the PI sanity > > checks and possible deadlocks. > > > > There is also the problem where we check the priority on the last > > request in the queue, but preemption may need to be disabled at the head > > of the queue (the executing request). Is that a concern? I was thinking > > of marking the context as non-preemptible until after the oa is removed. > > > What do you mean by timesliced? Actually just sent the patch that would allow this to apply to the other preemption path. > I don't really have a good way of working around the global counters so > whatever you think is best to allow queries in batch buffers to complete. Gut feeling is that if you need preemption protection, it should be on the context. Otherwise it seems easy to create a situation where the request being tracked for the context submission doesn't have the oa flag, but is masking one that does. -Chris _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx