On Mon, 2025-03-10 at 11:54 +0000, Tvrtko Ursulin wrote:
> 
> On 10/03/2025 11:11, Philipp Stanner wrote:
> > On Mon, 2025-03-10 at 09:55 +0000, Tvrtko Ursulin wrote:
> > > 
> > > On 07/03/2025 18:06, Philipp Stanner wrote:
> > > > On Fri, 2025-03-07 at 16:59 +0000, Tvrtko Ursulin wrote:
> > > > > 
> > > > > On 07/03/2025 13:41, Philipp Stanner wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > You forgot to put folks in CC as recipients for the cover letter :(
> > > > > > 
> > > > > > On Thu, 2025-03-06 at 17:05 +0000, Tvrtko Ursulin wrote:
> > > > > > > Move some options out into a new debug specific kconfig file in
> > > > > > > order to make things a bit cleaner.
> > > > > > > 
> > > > > > > Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>
> > > > > > > Cc: Christian König <christian.koenig@xxxxxxx>
> > > > > > > Cc: Danilo Krummrich <dakr@xxxxxxxxxx>
> > > > > > > Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
> > > > > > > Cc: Philipp Stanner <phasta@xxxxxxxxxx>
> > > > > > 
> > > > > > We all have our individual workflows, so don't take this as
> > > > > > lecturing or anything – I just suspect that I was forgotten in the
> > > > > > cover letter because you Cc people by hand in the individual
> > > > > > patches.
> > > > > > 
> > > > > > What I do is that I run get_maintainer and then put the individuals
> > > > > > listed there into the --to= field. That sends the entire series to
> > > > > > all of them.
> > > > > > 
> > > > > > Only sometimes, when there's a huge list of recipients or when the
> > > > > > patches of a series are very independent, I deviate from that rule.
> > > > > > 
> > > > > > JFYI
> > > > > 
> > > > > Notice it was there in v3, I just omitted to paste it this time.
> > > > > 
> > > > > > Anyways, we have a bigger problem about the entire series. I now
> > > > > > tested again with the same setup as yesterday and the faults are
> > > > > > indeed gone, so that's good.
> > > > > > 
> > > > > > But to be sure I then ran kmemleak and got a list of leaks that is
> > > > > > more than 2000 lines long.
> > > > > 
> > > > > There is this comment for drm_sched_fini which ends with:
> > > > > 
> > > > > """
> > > > > ...
> > > > >  * This stops submission of new jobs to the hardware through
> > > > >  * drm_sched_backend_ops.run_job(). Consequently,
> > > > >  * drm_sched_backend_ops.free_job() will not be called for all jobs
> > > > >  * still in drm_gpu_scheduler.pending_list. There is no solution for
> > > > >  * this currently. Thus, it is up to the driver to make sure that:
> > > > >  *
> > > > >  * a) drm_sched_fini() is only called after for all submitted jobs
> > > > >  *    drm_sched_backend_ops.free_job() has been called or that
> > > > >  * b) the jobs for which drm_sched_backend_ops.free_job() has not
> > > > >  *    been called after drm_sched_fini() ran are freed manually.
> > > > >  *
> > > > >  * FIXME: Take care of the above problem and prevent this function
> > > > >  * from leaking the jobs in drm_gpu_scheduler.pending_list under any
> > > > >  * circumstances.
> > > > > """ > > > > > > > > > > I got bitten by that. Keep forgetting how fragile the thing > > > > > is.. > > > > > :( > > > > > > > > argh damn, those are *all* from the pending_list?! > > > > > > Right, all leaks I saw were from the > > > drm_sched_basic_entity_cleanup > > > test. All other tests actually wait for jobs to finish so can't > > > hit > > > that. > > > > > > Fix was simply to add a drm_sched_job_cleanup call when unwinding > > > unfinished mock scheduler jobs from drm_mock_sched_fini, which > > > happens > > > before calling drm_sched_fini. > > > > > > That's pretty much how things are expected to be handled AFAIU. > > > > > > > OK. Well. > > > > > > > > Now we've got a philosophical problem: > > > > > > > > We still have to fix those leaks (I'm still working on it, but > > > > my > > > > current attempt has failed and I probably fall back to a > > > > refcount > > > > solution). > > > > > > You propose to move the responsibility of cleaning up in-flight > > > jobs > > > to > > > the scheduler core? > > > > The scheduler core is already and has always been responsible for > > cleaning up "in-flight jobs". It does so through > > backend_ops.free_job(). And we prevent it from cleaning up all jobs > > by > > cancelling the work items in drm_sched_fini(). > > > > Semantically, the scheduler is the one in charge of the job life > > times. > > > > As of right now, every single driver is effectively forced to > > implement > > the same logic, but they have implemented it in different ways (Xe > > refcounts the scheduler and only calls drm_sched_fini() once refcnt > > == > > 0, Nouveau maintains a copy of the pending_list, blocking for it to > > become empty before calling drm_sched_fini()) > > Right. And to change it means making ->free_job() for all drivers > handle > different potential job states, while today it only needs to handle > finished jobs. Or adding a new vfunc. Or something. It sounds doable > but > a lot of work, not least because there is a lot of drivers. I know the pain, I'm working on that since ~November. I plan to propose a new solution "soon"(tm) > > > > > And to see whether the fix actually fixes the leaks, directly > > > > using > > > > the > > > > kunit tests would be handy. > > > > > > > > After all, this is what the kunit tests are there for: show > > > > what is > > > > broken within the scheduler. And those leaks definitely > > > > qualify. Or > > > > should kunit tests follow the same rules we demand from > > > > drivers? > > > > > > > > I'd like to hear more opinions about that. > > > > > > > > @Danilo, @Dave, @Sima > > > > would it be OK if we add kunit tests for the scheduler to DRM > > > > that > > > > cause leaks until we can fix them? > > > > > > It is indeed a bit philosophical. I'd say only if there is a 100% > > > agreement that drm_sched_fini should be able to clean up, or > > > drive > > > cleaning up, all driver state. And if we are prepared to handle a > > > permanently failing test from now to some future date when this > > > would > > > be > > > implemented. > > > > > > I have a similar conundrum with set priority, where I was > > > contemplating > > > to add a permanently failing test showing how that does not fully > > > work, > > > and then get improved with my deadline scheduling series. > > > > > > On the other side of the argument is the past experience of CI > > > systems > > > generally not coping well with permanently failing test. > > > Eventually > > > they > > > succumb to the pressure to remove them due noisy results. 
> > > Therefore the other option is to have the mock scheduler adhere to the
> > > current implementation and only change it once the DRM scheduler rules
> > > change.
> > 
> > Can you think of a way, like flags or kconfig options, with which
> > developers such as you and I could "switch the bugs on" for working on
> > those issues?
> 
> We could do that easily I think. Something like:
> 
> config DRM_SCHED_KUNIT_TEST_ASPIRATIONAL
>         bool "Turn on the aspirational mode for DRM scheduler unit tests" if !KUNIT_ALL_TESTS
>         select DRM_SCHED
>         depends on DRM && KUNIT && DRM_SCHED_KUNIT_TEST
>         default KUNIT_ALL_TESTS
>         help
>           Choose this option to make the DRM scheduler unit tests test for
>           behaviour which was agreed as a design goal, even if the current
>           implementation can make specific tests fail.
> 
>           Recommended for driver developers only.
> 
>           If in doubt, say "N".

If you can work out something like this, that would be fantastic and
solve all the aforementioned problems.

> I can skip the job cleanup based on it and also add some validation that
> the pending list is empty after drm_sched_fini if on.

Sounds handy. That way the developer doesn't even have to use kmemleak.

P.

> 
> Regards,
> 
> Tvrtko
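
For illustration, a rough sketch of the kind of unwinding Tvrtko describes
above, i.e. running drm_sched_job_cleanup() on jobs the mock "hardware" never
completed before calling drm_sched_fini(). The struct and field names
(mock_scheduler, mock_sched_job, job_list, link) are made-up placeholders and
do not match the actual mock scheduler in the series; only
drm_sched_job_cleanup(), drm_sched_fini() and the list helpers are real
kernel APIs:

/* Hypothetical sketch only -- not the actual mock scheduler code. */
#include <linux/list.h>
#include <linux/slab.h>
#include <drm/gpu_scheduler.h>

struct mock_sched_job {
        struct drm_sched_job base;      /* embedded real scheduler job */
        struct list_head link;          /* tracked by the mock "hardware" */
};

struct mock_scheduler {
        struct drm_gpu_scheduler base;
        struct list_head job_list;      /* jobs handed to run_job() */
};

static void mock_sched_fini(struct mock_scheduler *sched)
{
        struct mock_sched_job *job, *next;

        /*
         * Unwind jobs the mock hardware never completed: release the
         * scheduler-side resources with drm_sched_job_cleanup() and free
         * each job, so drm_sched_fini() does not leak them from
         * pending_list (see the FIXME quoted above).
         */
        list_for_each_entry_safe(job, next, &sched->job_list, link) {
                list_del(&job->link);
                drm_sched_job_cleanup(&job->base);
                kfree(job);
        }

        drm_sched_fini(&sched->base);
}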
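
And a sketch of how the proposed aspirational Kconfig switch could be wired
into that same teardown path, building on the placeholder types above. Again
the names are hypothetical; IS_ENABLED(), drm_sched_fini() and
drm_gpu_scheduler.pending_list are the only real pieces, and
mock_sched_unwind_jobs() stands in for the cleanup loop from the previous
sketch:

static void mock_sched_fini(struct mock_scheduler *sched)
{
        /*
         * In aspirational mode, do not paper over the problem: leave the
         * unfinished jobs to drm_sched_fini() and check afterwards that it
         * really drained pending_list.
         */
        if (!IS_ENABLED(CONFIG_DRM_SCHED_KUNIT_TEST_ASPIRATIONAL))
                mock_sched_unwind_jobs(sched);  /* hypothetical helper */

        drm_sched_fini(&sched->base);

        if (IS_ENABLED(CONFIG_DRM_SCHED_KUNIT_TEST_ASPIRATIONAL))
                WARN_ON(!list_empty(&sched->base.pending_list));
}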