Document the implied lifetime rules of the scheduler (or at least the
intended ones), as well as the expectations of how resource acquisition
should be handled.

Signed-off-by: Asahi Lina <lina@xxxxxxxxxxxxx>
---
 drivers/gpu/drm/scheduler/sched_main.c | 58 ++++++++++++++++++++++++++++++++--
 1 file changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 7b2bfc10c1a5..1f3bc3606239 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -43,9 +43,61 @@
  *
  * The jobs in a entity are always scheduled in the order that they were pushed.
  *
- * Note that once a job was taken from the entities queue and pushed to the
- * hardware, i.e. the pending queue, the entity must not be referenced anymore
- * through the jobs entity pointer.
+ * Lifetime rules
+ * --------------
+ *
+ * Getting object lifetimes right across the stack is critical to avoid UAF
+ * issues. The DRM scheduler has the following lifetime rules:
+ *
+ * - The scheduler must outlive all of its entities.
+ * - Jobs pushed to the scheduler are owned by it, and must only be freed
+ *   after the free_job() callback is called.
+ * - Scheduler fences are reference-counted and may outlive the scheduler.
+ * - The scheduler *may* be destroyed while jobs are still in flight.
+ * - There is no guarantee that all jobs have been freed when all entities
+ *   and the scheduler have been destroyed. Jobs may be freed asynchronously
+ *   after this point.
+ * - Once a job is taken from the entity's queue and pushed to the hardware,
+ *   i.e. the pending queue, the entity must not be referenced any more
+ *   through the job's entity pointer. In other words, entities are not
+ *   required to outlive job execution.
+ *
+ * If the scheduler is destroyed with jobs in flight, the following
+ * happens:
+ *
+ * - Jobs that were pushed but have not yet run will be destroyed as part
+ *   of the entity cleanup (which must happen before the scheduler itself
+ *   is destroyed, per the first rule above). This signals the job
+ *   finished fence with an error flag. This process runs asynchronously
+ *   after drm_sched_entity_destroy() returns.
+ * - Jobs that are in flight on the hardware are "detached" from their
+ *   driver fence (the fence returned from the run_job() callback). In
+ *   this case, it is up to the driver to ensure that any bookkeeping or
+ *   internal data structures have separately managed lifetimes and that
+ *   the hardware either cancels the jobs or runs them to completion.
+ *   The DRM scheduler itself will immediately signal the job finished
+ *   fence (with an error flag) and then call free_job() as part of the
+ *   cleanup process.
+ *
+ * After the scheduler is destroyed, drivers *may* (but are not required to)
+ * skip signaling their remaining driver fences, as long as they have only ever
+ * been returned to the scheduler being destroyed as the return value from
+ * run_job() and not passed anywhere else. If these fences are used in any other
+ * context, then the driver *must* signal them, per the usual fence signaling
+ * rules.
+ *
+ * Resource management
+ * -------------------
+ *
+ * Drivers may need to acquire certain hardware resources (e.g. VM IDs) in order
+ * to run a job. This process must happen during the job's prepare_job()
+ * callback, not in the run_job() callback. If any resource is unavailable at
+ * job prepare time, the driver must return a suitable fence that the
+ * scheduler can wait on until the resource (potentially) becomes available.
+ *
+ * In order to avoid deadlocks, drivers must always acquire resources in the
+ * same order, and release them in reverse order when a job completes or if
+ * resource acquisition fails.
  */
 
 #include <linux/kthread.h>
-- 
2.40.1
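
To make the teardown ordering implied by the lifetime rules concrete, a
driver-side sketch might look like the following. struct my_gpu_queue and
my_gpu_queue_destroy() are invented names for illustration; only
drm_sched_entity_destroy() and drm_sched_fini() are existing scheduler API:

#include <drm/gpu_scheduler.h>

/* Hypothetical driver object owning one scheduler and one entity. */
struct my_gpu_queue {
        struct drm_gpu_scheduler sched;
        struct drm_sched_entity entity;
};

static void my_gpu_queue_destroy(struct my_gpu_queue *queue)
{
        /*
         * Entities go first: this cancels queued jobs and signals their
         * finished fences with an error. Job cleanup may continue
         * asynchronously after this returns.
         */
        drm_sched_entity_destroy(&queue->entity);

        /*
         * The scheduler may be torn down while jobs are still on the
         * hardware; those jobs are detached from their driver fences and
         * freed via free_job() as described above.
         */
        drm_sched_fini(&queue->sched);
}

The same ordering applies when several entities share one scheduler: all of
them must be destroyed before drm_sched_fini() is called.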
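
Similarly, a sketch of resource acquisition at prepare time, as described in
the resource management section. The prepare_job() signature matches struct
drm_sched_backend_ops; struct my_job, struct my_vm and the my_vmid_* helpers
are hypothetical:

#include <linux/container_of.h>
#include <linux/dma-fence.h>
#include <drm/gpu_scheduler.h>

struct my_vm;   /* hypothetical per-context VM object */

/* Hypothetical helpers: try to assign / release a hardware VM ID. */
struct dma_fence *my_vmid_try_assign(struct my_vm *vm);
void my_vmid_put(struct my_vm *vm);

struct my_job {
        struct drm_sched_job base;
        struct my_vm *vm;
};

static struct dma_fence *my_prepare_job(struct drm_sched_job *sched_job,
                                        struct drm_sched_entity *s_entity)
{
        struct my_job *job = container_of(sched_job, struct my_job, base);
        struct dma_fence *busy;

        /*
         * Acquire the VM ID here, not in run_job(). If none is free,
         * return a fence from a job that currently holds one; the
         * scheduler waits on it and then calls prepare_job() again.
         */
        busy = my_vmid_try_assign(job->vm);
        if (busy)
                return busy;

        return NULL;    /* all resources held, run_job() may be called */
}

The corresponding my_vmid_put() would then be called once the job completes
(e.g. from the driver's free_job() path), releasing resources in the reverse
order they were acquired.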