Re: [PATCH v4 1/5] drm: Move some options to separate new Kconfig

On 10/03/2025 11:11, Philipp Stanner wrote:
On Mon, 2025-03-10 at 09:55 +0000, Tvrtko Ursulin wrote:

On 07/03/2025 18:06, Philipp Stanner wrote:
On Fri, 2025-03-07 at 16:59 +0000, Tvrtko Ursulin wrote:

On 07/03/2025 13:41, Philipp Stanner wrote:
Hi,

You forgot to put folks in CC as recipients for the cover letter :(


On Thu, 2025-03-06 at 17:05 +0000, Tvrtko Ursulin wrote:
Move some options out into a new debug-specific Kconfig file in order
to make things a bit cleaner.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>
Cc: Christian König <christian.koenig@xxxxxxx>
Cc: Danilo Krummrich <dakr@xxxxxxxxxx>
Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
Cc: Philipp Stanner <phasta@xxxxxxxxxx>

We all have our individual work flows, so don't take this as lecturing
or anything – I just suspect that I was forgotten in the cover letter
because you Cc people by hand in the individual patches.

What I do is run get_maintainer and then put the individuals listed
there into the --to= field. That sends the entire series to all of them.

Only sometimes, when there's a huge list of recipients or when the
patches of a series are very independent, do I deviate from that rule.

JFYI

Note that it was there in v3; I just forgot to paste it this time.

Anyway, we have a bigger problem with the entire series. I now tested
again with the same setup as yesterday and the faults are indeed gone,
so that's good.

But to be sure I then ran kmemleak and got a list of leaks that is
more than 2000 lines long.

There is this comment for drm_sched_fini which ends with:

"""
...
    * This stops submission of new jobs to the hardware through
    * drm_sched_backend_ops.run_job(). Consequently,
drm_sched_backend_ops.free_job()
    * will not be called for all jobs still in
drm_gpu_scheduler.pending_list.
    * There is no solution for this currently. Thus, it is up to
the
driver to make
    * sure that:
    *
    *  a) drm_sched_fini() is only called after for all submitted
jobs
    *     drm_sched_backend_ops.free_job() has been called or that
    *  b) the jobs for which drm_sched_backend_ops.free_job() has
not
been
called
    *     after drm_sched_fini() ran are freed manually.
    *

    * FIXME: Take care of the above problem and prevent this
function
from
leaking
    * the jobs in drm_gpu_scheduler.pending_list under any
circumstances.
"""

I got bitten by that. Keep forgetting how fragile the thing is... :(

argh damn, those are *all* from the pending_list?!

Right, all the leaks I saw were from the drm_sched_basic_entity_cleanup
test. All other tests actually wait for jobs to finish, so they can't
hit that.

The fix was simply to add a drm_sched_job_cleanup() call when unwinding
unfinished mock scheduler jobs from drm_mock_sched_fini(), which happens
before calling drm_sched_fini().
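
Roughly like this - just a sketch from memory, the struct and list
names below are assumptions rather than the exact mock scheduler code:

	static void drm_mock_sched_fini(struct drm_mock_scheduler *sched)
	{
		struct drm_mock_sched_job *job, *next;

		/* Unwind jobs the mock "hardware" never completed. */
		list_for_each_entry_safe(job, next, &sched->job_list, link) {
			list_del(&job->link);
			drm_sched_job_cleanup(&job->base); /* the added call */
		}

		drm_sched_fini(&sched->base);
	}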

That's pretty much how things are expected to be handled AFAIU.

OK. Well.

Now we've got a philosophical problem:

We still have to fix those leaks (I'm still working on it, but my
current attempt has failed and I'll probably fall back to a refcount
solution).

You propose to move the responsibility of cleaning up in-flight jobs
to the scheduler core?

The scheduler core is already and has always been responsible for
cleaning up "in-flight jobs". It does so through backend_ops.free_job().
And we prevent it from cleaning up all jobs by cancelling the work items
in drm_sched_fini().

Semantically, the scheduler is the one in charge of the job lifetimes.

As of right now, every single driver is effectively forced to implement
the same logic, but they have implemented it in different ways (Xe
refcounts the scheduler and only calls drm_sched_fini() once refcnt ==
0, Nouveau maintains a copy of the pending_list, blocking until it
becomes empty before calling drm_sched_fini()).

Right. And changing it means making ->free_job() for all drivers handle
different potential job states, while today it only needs to handle
finished jobs. Or adding a new vfunc. Or something. It sounds doable,
but it is a lot of work, not least because there are a lot of drivers.
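
(For illustration, the per-driver refcounting workaround you mention
looks roughly like the below. A sketch only, not Xe's actual code, and
the struct names are made up.)

	struct my_sched {
		struct drm_gpu_scheduler base;
		struct kref refcount;	/* one ref per in-flight job */
	};

	struct my_job {
		struct drm_sched_job base;
		struct my_sched *sched;
	};

	static void my_sched_release(struct kref *kref)
	{
		struct my_sched *s = container_of(kref, struct my_sched,
						  refcount);

		/* Last job reference gone, pending_list is empty by now. */
		drm_sched_fini(&s->base);
		kfree(s);
	}

	/* kref_get() is taken when a job is pushed to the scheduler. */
	static void my_free_job(struct drm_sched_job *sched_job)
	{
		struct my_job *job = container_of(sched_job, struct my_job,
						  base);
		struct my_sched *s = job->sched;

		drm_sched_job_cleanup(sched_job);
		kfree(job);
		kref_put(&s->refcount, my_sched_release);
	}

And every driver ends up carrying some variant of this boilerplate.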

And to see whether the fix actually fixes the leaks, directly using the
kunit tests would be handy.

After all, this is what the kunit tests are there for: show what is
broken within the scheduler. And those leaks definitely qualify. Or
should kunit tests follow the same rules we demand from drivers?

I'd like to hear more opinions about that.

@Danilo, @Dave, @Sima: would it be OK if we add kunit tests for the
scheduler to DRM that cause leaks until we can fix them?

It is indeed a bit philosophical. I'd say only if there is 100%
agreement that drm_sched_fini should be able to clean up, or drive
cleaning up, all driver state. And if we are prepared to handle a
permanently failing test from now until some future date when this
would be implemented.

I have a similar conundrum with set priority, where I was contemplating
adding a permanently failing test showing how that does not fully work,
which would then get improved with my deadline scheduling series.

On the other side of the argument is the past experience of CI systems
generally not coping well with permanently failing tests. Eventually
they succumb to the pressure to remove them due to noisy results.
Therefore the other option is to have the mock scheduler adhere to the
current implementation and only change it once the DRM scheduler rules
change.

Can you think of a way, like flags or kconfig options, with which
developers such as you and I could "switch the bugs on" for working on
those issues?

We could do that easily I think. Something like:

config DRM_SCHED_KUNIT_TEST_ASPIRATIONAL
	bool "Turn on the aspirational mode for DRM scheduler unit tests" if !KUNIT_ALL_TESTS
	select DRM_SCHED
	depends on DRM && KUNIT && DRM_SCHED_KUNIT_TEST
	default KUNIT_ALL_TESTS
	help
	  Choose this option to make the DRM scheduler unit tests test for
	  behaviour which was agreed as a design goal, even if the current
	  implementation can make specific tests fail.

	  Recommended for driver developers only.

	  If in doubt, say "N".

If it is on, I can skip the job cleanup and also add some validation
that the pending list is empty after drm_sched_fini().
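
Building on the unwind loop sketched earlier in the thread, something
along these lines (again only a sketch, the mock scheduler names are
assumptions):

	static void drm_mock_sched_fini(struct drm_mock_scheduler *sched)
	{
		struct drm_mock_sched_job *job, *next;

		list_for_each_entry_safe(job, next, &sched->job_list, link) {
			list_del(&job->link);
			if (!IS_ENABLED(CONFIG_DRM_SCHED_KUNIT_TEST_ASPIRATIONAL))
				/* Current rules: test cleans up unfinished jobs itself. */
				drm_sched_job_cleanup(&job->base);
		}

		drm_sched_fini(&sched->base);

		/* Aspirational rules: the core must have drained the pending list. */
		if (IS_ENABLED(CONFIG_DRM_SCHED_KUNIT_TEST_ASPIRATIONAL))
			WARN_ON(!list_empty(&sched->base.pending_list));
	}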

Regards,

Tvrtko



