On 2023-09-11 22:16, Matthew Brost wrote:
> Provide documentation to guide in ways to teardown an entity.
>
> Signed-off-by: Matthew Brost <matthew.brost@xxxxxxxxx>
> ---
>  Documentation/gpu/drm-mm.rst             |  6 ++++++
>  drivers/gpu/drm/scheduler/sched_entity.c | 19 +++++++++++++++++++
>  2 files changed, 25 insertions(+)
>
> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> index c19b34b1c0ed..cb4d6097897e 100644
> --- a/Documentation/gpu/drm-mm.rst
> +++ b/Documentation/gpu/drm-mm.rst
> @@ -552,6 +552,12 @@ Overview
>  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
>     :doc: Overview
>
> +Entity teardown
> +---------------
> +
> +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_entity.c
> +   :doc: Entity teardown
> +
>  Scheduler Function References
>  -----------------------------
>
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 37557fbb96d0..76f3e10218bb 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -21,6 +21,25 @@
>   *
>   */
>
> +/**
> + * DOC: Entity teardown
> + *
> + * Drivers can teardown down an entity for several reasons. Reasons typically
> + * are a user closes the entity via an IOCTL, the FD associated with the entity
> + * is closed, or the entity encounters an error.

So in this third case, "entity encounters an error", we need to make
sure that no new jobs are being pushed to the entity, or at least say
that here. IOW, in all three cases, the common denominator
(requirement?) is that no new jobs are being pushed to the entity,
i.e. that there are no incoming jobs.

> The GPU scheduler provides the
> + * basic infrastructure to do this in a few different ways.

Well, I'd say "in two different ways." or "in the following two ways."
I'd rather have "two" in there to make sure that it is these two, and
not any more/less/etc.

> + *
> + * 1. Let the entity run dry (both the pending list and job queue) and then call
> + *    drm_sched_entity_fini. The backend can accelerate the process of running dry.
> + *    For example set a flag so run_job is a NOP and set the TDR to a low value to
> + *    signal all jobs in a timely manner (this example works for
> + *    DRM_SCHED_POLICY_SINGLE_ENTITY).
> + *
> + * 2. Kill the entity directly via drm_sched_entity_flush /
> + *    drm_sched_entity_fini ensuring all pending and queued jobs are off the
> + *    hardware and signaled.
> + */
> +
>  #include <linux/kthread.h>
>  #include <linux/slab.h>
>  #include <linux/completion.h>

-- 
Regards,
Luben
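
P.S. To make the "no incoming jobs" requirement concrete, here is a
minimal userspace sketch of the two teardown paths. It is only a model
under my own assumptions: struct model_entity and every model_*
function are hypothetical names and are not part of the drm_sched API.
The only point it tries to show is that both paths first stop accepting
new jobs and only then empty the job queue and the pending list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of an entity; NOT the drm_sched API. */
struct model_entity {
	bool stopped;	/* set once teardown starts: no new jobs accepted */
	size_t queued;	/* jobs still waiting in the entity's job queue */
	size_t pending;	/* jobs handed to the hardware, not yet signaled */
};

/* Push one job; returns false once teardown has started. */
static bool model_push_job(struct model_entity *e)
{
	if (e->stopped)
		return false;
	e->queued++;
	return true;
}

/* Model of run_job: move one queued job to the pending list. */
static bool model_run_one(struct model_entity *e)
{
	if (e->queued == 0)
		return false;
	e->queued--;
	e->pending++;
	return true;
}

/* Model of a job's fence signaling: retire one pending job. */
static bool model_signal_one(struct model_entity *e)
{
	if (e->pending == 0)
		return false;
	e->pending--;
	return true;
}

/* Way 1: stop incoming jobs, then let the queue and pending list run dry. */
static void model_teardown_run_dry(struct model_entity *e)
{
	e->stopped = true;
	while (model_run_one(e))
		;
	while (model_signal_one(e))
		;
	assert(e->queued == 0 && e->pending == 0);
}

/*
 * Way 2: stop incoming jobs and kill the entity. In this model, queued
 * jobs are dropped without being run and pending jobs are signaled.
 * Returns the number of queued jobs that were dropped.
 */
static size_t model_teardown_kill(struct model_entity *e)
{
	size_t dropped;

	e->stopped = true;
	dropped = e->queued;
	e->queued = 0;
	while (model_signal_one(e))
		;
	return dropped;
}
```

In this model a push after either teardown path returns false, which is
the common requirement I am asking the documentation to state for all
three teardown reasons.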