[PATCH] drm/i915: Document the Virtual Engine uAPI

From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

Add some documentation covering engine discovery, context engine maps and
virtual engines. It is not very detailed, but it is meant as a starting
point that gives a brief, high-level overview of the general principles and
intended use cases.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
---
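Not part of the patch: a minimal sketch of how the engine list returned by the
discovery query might be narrowed down to one class before it is used as the
sibling list of a virtual engine. The `struct engine_id` type, the
`EXAMPLE_CLASS_VIDEO` constant and the `pick_siblings()` helper are
hypothetical stand-ins so the snippet builds without the uAPI header; real
code would use `struct i915_engine_class_instance` and
`I915_ENGINE_CLASS_VIDEO` from i915_drm.h instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct i915_engine_class_instance, so that the
 * sketch compiles without i915_drm.h.
 */
struct engine_id {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Illustrative value only; real code uses I915_ENGINE_CLASS_VIDEO. */
#define EXAMPLE_CLASS_VIDEO 2

/*
 * Copy every engine of the wanted class from 'all' into 'out', stopping
 * after 'max' entries. Returns the number of entries written.
 */
static size_t pick_siblings(const struct engine_id *all, size_t count,
			    uint16_t wanted_class,
			    struct engine_id *out, size_t max)
{
	size_t i, n = 0;

	assert(all != NULL && out != NULL);

	for (i = 0; i < count && n < max; i++) {
		if (all[i].engine_class == wanted_class)
			out[n++] = all[i];
	}

	return n;
}
```

The returned count would then be the natural value for `num_siblings`, with
the `out` array supplying the entries of the load balance extension.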
 Documentation/gpu/i915.rst | 184 +++++++++++++++++++++++++++++++++++++
 1 file changed, 184 insertions(+)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 42ce0196930a..8e5ab299c31f 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -335,6 +335,190 @@ for execution also include a list of all locations within buffers that
 refer to GPU-addresses so that the kernel can edit the buffer correctly.
 This process is dubbed relocation.
 
+Engine Discovery uAPI
+---------------------
+
+Engine discovery uAPI is a way of enumerating physical engines present in a GPU
+associated with an open i915 DRM file descriptor. This supersedes the old way of
+using `DRM_IOCTL_I915_GETPARAM` and engine identifiers like
+`I915_PARAM_HAS_BLT`.
+
+The need for this interface arose with Icelake and newer GPUs, which
+established a pattern of having multiple engines of the same class, where
+not all instances are always fully functionally equivalent.
+
+The entry point for this uAPI is `DRM_IOCTL_I915_QUERY` with
+`DRM_I915_QUERY_ENGINE_INFO` as the queried item id.
+
+Example for getting the list of engines:
+
+.. code-block:: C
+
+	struct drm_i915_query_engine_info *info;
+	struct drm_i915_query_item item = {
+		.query_id = DRM_I915_QUERY_ENGINE_INFO,
+	};
+	struct drm_i915_query query = {
+		.num_items = 1,
+		.items_ptr = (uintptr_t)&item,
+	};
+	int err, i;
+
+	// First query the size of the blob we need, this needs to be large
+	// enough to hold our array of engines. The kernel will fill out the
+	// item.length for us, which is the number of bytes we need.
+	//
+	// Alternatively a large buffer can be allocated straight away enabling
+	// querying in one pass, in which case item.length should contain the
+	// length of the provided buffer.
+	err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
+	if (err) ...
+
+	info = calloc(1, item.length);
+	// Now that we allocated the required number of bytes, we call the ioctl
+	// again, this time with the data_ptr pointing to our newly allocated
+	// blob, which the kernel can then populate with info on all engines.
+	item.data_ptr = (uintptr_t)info;
+
+	err = ioctl(fd, DRM_IOCTL_I915_QUERY, &query);
+	if (err) ...
+
+	// We can now access each engine in the array
+	for (i = 0; i < info->num_engines; i++) {
+		struct drm_i915_engine_info einfo = info->engines[i];
+		__u16 class = einfo.engine.engine_class;
+		__u16 instance = einfo.engine.engine_instance;
+		....
+	}
+
+	free(info);
+
+Each of the enumerated engines, apart from being defined by its class and
+instance (see `struct i915_engine_class_instance`), can also have flags and
+capabilities defined as documented in i915_drm.h.
+
+For instance, video engines which support HEVC encoding will have the
+`I915_VIDEO_CLASS_CAPABILITY_HEVC` capability bit set.
+
+Engine discovery only fully comes into its own when combined with the new way
+of addressing engines when submitting batch buffers using contexts with engine
+maps configured.
+
+Context Engine Map uAPI
+-----------------------
+
+Context engine map is a new way of addressing engines when submitting batch
+buffers, replacing the existing way of using identifiers like `I915_EXEC_BLT`
+inside the flags field of `struct drm_i915_gem_execbuffer2`.
+
+To use it, created GEM contexts need to be configured with a list of engines
+the user intends to submit to. This is accomplished using the
+`I915_CONTEXT_PARAM_ENGINES` parameter and `struct i915_context_param_engines`.
+
+For such contexts the `I915_EXEC_RING_MASK` field becomes an index into the
+configured map.
+
+Example of creating such a context and submitting against it:
+
+.. code-block:: C
+
+	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 2) = {
+		.engines = { { I915_ENGINE_CLASS_RENDER, 0 },
+			     { I915_ENGINE_CLASS_COPY, 0 } }
+	};
+	struct drm_i915_gem_context_create_ext_setparam p_engines = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_ENGINES,
+			.value = to_user_pointer(&engines),
+			.size = sizeof(engines),
+		},
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p_engines),
+	};
+
+	ctx_id = gem_context_create_ext(drm_fd, &create);
+
+	// We have now created a GEM context with two engines in the map:
+	// Index 0 points to rcs0 while index 1 points to bcs0. Other engines
+	// will not be accessible from this context.
+
+	...
+	execbuf.rsvd1 = ctx_id;
+	execbuf.flags = 0; // Submits to index0, which is rcs0 for this context
+	gem_execbuf(drm_fd, &execbuf);
+
+	...
+	execbuf.rsvd1 = ctx_id;
+	execbuf.flags = 1; // Submits to index 1, which is bcs0 for this context
+	gem_execbuf(drm_fd, &execbuf);
+
+Virtual Engine uAPI
+-------------------
+
+Virtual engine is a concept where userspace is able to configure a set of
+physical engines, submit a batch buffer, and let the driver execute it on any
+engine from the set as it sees fit.
+
+This is primarily useful on parts which have multiple instances of the same
+engine class, for example GT3+ Skylake parts with their two VCS engines.
+
+For instance, userspace can enumerate all engines of a certain class using the
+previously described `Engine Discovery uAPI`_. After
+that, userspace can create a GEM context with a placeholder slot for the virtual
+engine (using `I915_ENGINE_CLASS_INVALID` and `I915_ENGINE_CLASS_INVALID_NONE`
+for class and instance respectively) and finally, using the
+`I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE` extension, place a virtual engine in the
+same reserved slot.
+
+Example of creating a virtual engine and submitting a batch buffer to it:
+
+.. code-block:: C
+
+	I915_DEFINE_CONTEXT_ENGINES_LOAD_BALANCE(virtual, 2) = {
+		.base.name = I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE,
+		.engine_index = 0, // Place this virtual engine into engine map slot 0
+		.num_siblings = 2,
+		.engines = { { I915_ENGINE_CLASS_VIDEO, 0 },
+			     { I915_ENGINE_CLASS_VIDEO, 1 }, },
+	};
+	I915_DEFINE_CONTEXT_PARAM_ENGINES(engines, 1) = {
+		.engines = { { I915_ENGINE_CLASS_INVALID,
+			       I915_ENGINE_CLASS_INVALID_NONE } },
+		.extensions = to_user_pointer(&virtual), // Chains with the load_balance extension
+	};
+	struct drm_i915_gem_context_create_ext_setparam p_engines = {
+		.base = {
+			.name = I915_CONTEXT_CREATE_EXT_SETPARAM,
+		},
+		.param = {
+			.param = I915_CONTEXT_PARAM_ENGINES,
+			.value = to_user_pointer(&engines),
+			.size = sizeof(engines),
+		},
+	};
+	struct drm_i915_gem_context_create_ext create = {
+		.flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
+		.extensions = to_user_pointer(&p_engines),
+	};
+
+	ctx_id = gem_context_create_ext(drm_fd, &create);
+
+	// Now we have created a GEM context with its engine map containing a
+	// single virtual engine. Submissions to this slot can go either to
+	// vcs0 or vcs1, depending on the load balancing algorithm used inside
+	// the driver. The load balancing is dynamic from one batch buffer to
+	// another and transparent to userspace.
+
+	...
+	execbuf.rsvd1 = ctx_id;
+	execbuf.flags = 0; // Submits to index0 which is the virtual engine
+	gem_execbuf(drm_fd, &execbuf);
+
 Locking Guidelines
 ------------------
 
-- 
2.30.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


