Re: [PATCH 1/5] drm/sched: Optimise drm_sched_entity_push_job

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 15/10/2024 15:00, Philipp Stanner wrote:
On Tue, 2024-10-15 at 14:14 +0100, Tvrtko Ursulin wrote:

On 15/10/2024 12:38, Philipp Stanner wrote:
On Tue, 2024-10-15 at 09:12 +0100, Tvrtko Ursulin wrote:

On 15/10/2024 08:11, Philipp Stanner wrote:
On Mon, 2024-10-14 at 13:07 +0100, Tvrtko Ursulin wrote:

On 14/10/2024 12:32, Philipp Stanner wrote:
Hi,

On Mon, 2024-10-14 at 11:46 +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>

In FIFO mode We can avoid dropping the lock only to
immediately
re-
acquire
by adding a new drm_sched_rq_update_fifo_locked() helper.


Please write detailed commit messages, as described here
[1].
       1. Describe the problem: current state and why it's
bad.
       2. Then, describe in imperative (present tense) form
what
the
commit
          does about the problem.

Both pieces of info are already there:

1. Drops the lock to immediately re-acquire it.
2. We avoid that by by adding a locked helper.
Optionally, in between can be information about why it's
solved
this
way and not another etc.

Applies to the other patches, too.


[1]
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

Thanks I am new here and did not know this.

Seriosuly, lets not be too blindly strict about this because
it
can
get
IMO ridiculous.

One example when I previously accomodated your request is
patch
3/5
from
this series:

"""
Current kerneldoc for struct drm_sched_rq incompletely
documents
what
fields are protected by the lock.

This is not good because it is misleading.

Lets fix it by listing all the elements which are protected
by
the
lock.
"""

While this was the original commit text you weren't happy
with:

"""
drm/sched: Re-order struct drm_sched_rq members for clarity

Lets re-order the members to make it clear which are
protected by
the
lock
and at the same time document it via kerneldoc.
"""

I maintain the original text was passable.

On top, this was just a respin to accomodate the merge
process.
All
approvals were done and dusted couple weeks or so ago so
asking
for
yet
another respin for such trivial objections is not great.

I understand that you're unhappy, but please understand the
position
I'm coming from. As you know, since you sent these patches
within a
different series (and, thus, since I reviewed them), I was
trusted
with
co-maintaining this piece of shared infrastructure.

And since you've worked on it a bit now, I suppose you also
know
that
the GPU Scheduler is arguably in quite a bad shape, has far too
little
documentation, has leaks, maybe race conditions, parts *where
the
locking rules are unclear* and is probably only fully
understood by
a
small hand full of people. I also argue that this is a *very*
complicated piece of software.

We already went over that and agreed. Not least I agreed the base
is
shaky since few years  ago. :)

Btw if things align, I hope you will at some point see a follow
up
series from me which makes some significant simplifications and
improvements at the same time.

Cool, good to hear!
(Would be even cooler if simplifications and improvements can be
delivered through separate patch series to be easier to review
etc.)

Yes, when I spot something I pull it ahead and/or standalone when it
makes sense. But it is early days and a big job.

So I might be or appear to be a bit pedantic, but I'm not doing
that to
terrorize you, but because I want this thing to become well
documented,
understandable, and bisectable. Working towards a canonical,
idiot-
proof commit style is one measure that will help with that.

I want to offer you the following: I can be more relaxed with
things
universally recognized as trivial (comment changes, struct
member
reordering) – but when something like a lock is touched in any
way,
we
shall document that in the commit message as canonically as
possible,
so someone who's less experienced and just bisected the commit
immediately understands what has been done (or rather: was
supposed
to
be done).

So how would you suggest to expand this commit text so it doesn't
read
too self-repeating?

My issue with this particular commit message is mainly that it
doesn't
make it obvious what the patch is supposed to do. So one can make
it
quicker and better to review by detailing it a bit more, so the
reviewer then can compare commit message vs. what the code does. It
seems to me for example that the actual optimization is being done
in
drm_sched_entity_push_job(), and drm_sched_entity_pop_job() had to
be
ported, too, for correctness

"It seems" aka the commit title says so. ;)

Another small thing that might be cool is something that makes it a
bit
more obvious that this is an optimization, not a fix.

So I would probably write:

"So far, drm_sched_rq_update_fifo() automatically takes
drm_sched_entity.rq_lock. For DRM_SCHED_POLICY_FIFO, this is
inefficient because that lock is then taken, released and retaken
in
drm_sched_entity_push_job().

Improve performance by moving the locking out of
drm_sched_rq_update_fifo()."

Not that much longer but makes it far more obvious what shall be
achieved where :]

How about this:

"""
In FIFO mode (which is the default), both drm_sched_entity_push_job()
and drm_sched_rq_update_fifo(), where the latter calls the former,
are
currently taking and releasing the same entity->rq_lock.

We can avoid that design inelegance, and also have a miniscule
efficiency improvement on the idle submit path, by introducing a new
drm_sched_rq_update_fifo_locked() helper and pulling up the lock
taking
to its callers.
"""

That looks good to me

Cool. So this for 1/5, your text and some tweaks for 4/5, anything else or I can respin with that?

Regards,

Tvrtko


P.



(Let me read through the other patches briefly. Then we should be
good
with v2 of this series.. or would it be v3 then?)

Depends how you count. By unique series titles or by fundamental
content.

Regards,

Tvrtko





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux