Re: [RFC v4 1/2] drm/i915: Engine discovery uAPI

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 28/06/2017 12:27, Chris Wilson wrote:
Quoting Tvrtko Ursulin (2017-06-26 16:47:41)
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Engine discovery uAPI allows userspace to probe for engine
configuration and features without needing to maintain the
internal PCI id based database.

This enables removal of code duplications across userspace
components.

Probing is done via the new DRM_IOCTL_I915_GEM_ENGINE_INFO ioctl
which returns the number and information on the specified engine
class.

Currently only general engine configuration and HEVC feature of
the VCS engine can be probed but the uAPI is designed to be
generic and extensible.

Code is based almost exactly on the earlier proposal on the
topic by Jon Bloomfield. Engine class and instance refactoring
made recently by Daniele Ceraolo Spurio enabled this to be
implemented in an elegant fashion.

To probe configuration userspace sets the engine class it wants
to query (struct drm_i915_gem_engine_info) and provides an array
of drm_i915_engine_info structs which will be filled in by the
driver. Userspace also has to tell i915 how many elements are in
the array, and the driver will report back the total number of
engine instances in any case.

v2:
  * Add a version field and consolidate to one engine count.
    (Chris Wilson)
  * Rename uAPI flags for VCS engines to DRM_I915_ENGINE_CLASS_VIDEO.
    (Gong Zhipeng)

v3:
  * Drop the DRM_ prefix from uAPI enums. (Chris Wilson)
  * Only reserve 8-bits for the engine info for time being.

v4:
  * Made user_class_map static.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Ben Widawsky <ben@xxxxxxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Daniel Vetter <daniel.vetter@xxxxxxxxx>
Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>
Cc: Jon Bloomfield <jon.bloomfield@xxxxxxxxx>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@xxxxxxxxx>
Cc: Oscar Mateo <oscar.mateo@xxxxxxxxx>
Cc: "Gong, Zhipeng" <zhipeng.gong@xxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_drv.c        |  1 +
  drivers/gpu/drm/i915/i915_drv.h        |  3 ++
  drivers/gpu/drm/i915/intel_engine_cs.c | 64 ++++++++++++++++++++++++++++++++++
  include/uapi/drm/i915_drm.h            | 41 ++++++++++++++++++++++
  4 files changed, 109 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e9d8e9ee51b2..44dd2f37937f 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -2724,6 +2724,7 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
         DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
         DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
         DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
+       DRM_IOCTL_DEF_DRV(I915_GEM_ENGINE_INFO, i915_gem_engine_info_ioctl, DRM_RENDER_ALLOW),
  };
static struct drm_driver driver = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 427d10c7eca5..496565e1753f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3598,6 +3598,9 @@ void i915_oa_init_reg_state(struct intel_engine_cs *engine,
                             struct i915_gem_context *ctx,
                             uint32_t *reg_state);
+int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
+                              struct drm_file *file);
+
  /* i915_gem_evict.c */
  int __must_check i915_gem_evict_something(struct i915_address_space *vm,
                                           u64 min_size, u64 alignment,
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 3b46c1f7b88b..a98669f6ad85 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -25,6 +25,7 @@
  #include "i915_drv.h"
  #include "intel_ringbuffer.h"
  #include "intel_lrc.h"
+#include <uapi/drm/i915_drm.h>
/* Haswell does have the CXT_SIZE register however it does not appear to be
   * valid. Now, docs explain in dwords what is in the context object. The full
@@ -1332,6 +1333,69 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
         }
  }
+static u8 user_class_map[I915_ENGINE_CLASS_MAX] = {
+       [I915_ENGINE_CLASS_OTHER] = OTHER_CLASS,
+       [I915_ENGINE_CLASS_RENDER] = RENDER_CLASS,
+       [I915_ENGINE_CLASS_COPY] = COPY_ENGINE_CLASS,
+       [I915_ENGINE_CLASS_VIDEO] = VIDEO_DECODE_CLASS,
+       [I915_ENGINE_CLASS_VIDEO_ENHANCE] = VIDEO_ENHANCEMENT_CLASS,
+};
+
+int i915_gem_engine_info_ioctl(struct drm_device *dev, void *data,
+                              struct drm_file *file)
+{
+       struct drm_i915_private *i915 = to_i915(dev);
+       struct drm_i915_gem_engine_info *args = data;
+       struct drm_i915_engine_info __user *user_info =
+                                       u64_to_user_ptr(args->info_ptr);
+       unsigned int info_size = args->num_engines;
+       struct drm_i915_engine_info info;
+       struct intel_engine_cs *engine;
+       enum intel_engine_id id;
+       u8 class;
+
+       if (args->rsvd)
+               return -EINVAL;
+
+       switch (args->engine_class) {
+       case I915_ENGINE_CLASS_OTHER:
+       case I915_ENGINE_CLASS_RENDER:
+       case I915_ENGINE_CLASS_COPY:
+       case I915_ENGINE_CLASS_VIDEO:
+       case I915_ENGINE_CLASS_VIDEO_ENHANCE:
+               class = user_class_map[args->engine_class];
+               break;
+       case I915_ENGINE_CLASS_MAX:
+       default:
+               return -EINVAL;
+       };
+
+       args->num_engines = 0;
+
+       if (args->version != 1) {
+               args->version = 1;
+               return 0;

My gut feeling says we want to report both the version and count in the
first pass. Otherwise we have to do a version check, followed by a count
query, followed by the actual retrieval. If we can combine the version +
count check into one pass, it may be a little less hassle?

Could do, I don't see any downsides to it at the moment.

But the only benefit seems it would be in cases where userspace would want to allocate the info array to the exact size, rather than just using a large enough number.

In both cases userspace has to check both return code from the ioctl and that the version number matches the requested one. And in both cases the whole operation can be done with just one ioctl. Or if you want to allocate the exact size info array in both cases you need two ioctls.

info.version = 1;
info.class = <some class>;
info.num_engines = <a large number should be enough for everyone>;
ret = ioctl(&info);
if (ret || info.version != 1)
	goto try_a_lower_version;	
for_each_engine(info) {
	...
}

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux