On Tue, 2024-10-15 at 14:14 +0100, Tvrtko Ursulin wrote: > > On 15/10/2024 12:38, Philipp Stanner wrote: > > On Tue, 2024-10-15 at 09:12 +0100, Tvrtko Ursulin wrote: > > > > > > On 15/10/2024 08:11, Philipp Stanner wrote: > > > > On Mon, 2024-10-14 at 13:07 +0100, Tvrtko Ursulin wrote: > > > > > > > > > > On 14/10/2024 12:32, Philipp Stanner wrote: > > > > > > Hi, > > > > > > > > > > > > On Mon, 2024-10-14 at 11:46 +0100, Tvrtko Ursulin wrote: > > > > > > > From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx> > > > > > > > > > > > > > > In FIFO mode We can avoid dropping the lock only to > > > > > > > immediately > > > > > > > re- > > > > > > > acquire > > > > > > > by adding a new drm_sched_rq_update_fifo_locked() helper. > > > > > > > > > > > > > > > > > > > Please write detailed commit messages, as described here > > > > > > [1]. > > > > > > 1. Describe the problem: current state and why it's > > > > > > bad. > > > > > > 2. Then, describe in imperative (present tense) form > > > > > > what > > > > > > the > > > > > > commit > > > > > > does about the problem. > > > > > > > > > > Both pieces of info are already there: > > > > > > > > > > 1. Drops the lock to immediately re-acquire it. > > > > > 2. We avoid that by by adding a locked helper. > > > > > > Optionally, in between can be information about why it's > > > > > > solved > > > > > > this > > > > > > way and not another etc. > > > > > > > > > > > > Applies to the other patches, too. > > > > > > > > > > > > > > > > > > [1] > > > > > > https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes > > > > > > > > > > Thanks I am new here and did not know this. > > > > > > > > > > Seriosuly, lets not be too blindly strict about this because > > > > > it > > > > > can > > > > > get > > > > > IMO ridiculous. > > > > > > > > > > One example when I previously accomodated your request is > > > > > patch > > > > > 3/5 > > > > > from > > > > > this series: > > > > > > > > > > """ > > > > > Current kerneldoc for struct drm_sched_rq incompletely > > > > > documents > > > > > what > > > > > fields are protected by the lock. > > > > > > > > > > This is not good because it is misleading. > > > > > > > > > > Lets fix it by listing all the elements which are protected > > > > > by > > > > > the > > > > > lock. > > > > > """ > > > > > > > > > > While this was the original commit text you weren't happy > > > > > with: > > > > > > > > > > """ > > > > > drm/sched: Re-order struct drm_sched_rq members for clarity > > > > > > > > > > Lets re-order the members to make it clear which are > > > > > protected by > > > > > the > > > > > lock > > > > > and at the same time document it via kerneldoc. > > > > > """ > > > > > > > > > > I maintain the original text was passable. > > > > > > > > > > On top, this was just a respin to accomodate the merge > > > > > process. > > > > > All > > > > > approvals were done and dusted couple weeks or so ago so > > > > > asking > > > > > for > > > > > yet > > > > > another respin for such trivial objections is not great. > > > > > > > > I understand that you're unhappy, but please understand the > > > > position > > > > I'm coming from. As you know, since you sent these patches > > > > within a > > > > different series (and, thus, since I reviewed them), I was > > > > trusted > > > > with > > > > co-maintaining this piece of shared infrastructure. > > > > > > > > And since you've worked on it a bit now, I suppose you also > > > > know > > > > that > > > > the GPU Scheduler is arguably in quite a bad shape, has far too > > > > little > > > > documentation, has leaks, maybe race conditions, parts *where > > > > the > > > > locking rules are unclear* and is probably only fully > > > > understood by > > > > a > > > > small hand full of people. I also argue that this is a *very* > > > > complicated piece of software. > > > > > > We already went over that and agreed. Not least I agreed the base > > > is > > > shaky since few years ago. :) > > > > > > Btw if things align, I hope you will at some point see a follow > > > up > > > series from me which makes some significant simplifications and > > > improvements at the same time. > > > > Cool, good to hear! > > (Would be even cooler if simplifications and improvements can be > > delivered through separate patch series to be easier to review > > etc.) > > Yes, when I spot something I pull it ahead and/or standalone when it > makes sense. But it is early days and a big job. > > > > > So I might be or appear to be a bit pedantic, but I'm not doing > > > > that to > > > > terrorize you, but because I want this thing to become well > > > > documented, > > > > understandable, and bisectable. Working towards a canonical, > > > > idiot- > > > > proof commit style is one measure that will help with that. > > > > > > > > I want to offer you the following: I can be more relaxed with > > > > things > > > > universally recognized as trivial (comment changes, struct > > > > member > > > > reordering) – but when something like a lock is touched in any > > > > way, > > > > we > > > > shall document that in the commit message as canonically as > > > > possible, > > > > so someone who's less experienced and just bisected the commit > > > > immediately understands what has been done (or rather: was > > > > supposed > > > > to > > > > be done). > > > > > > So how would you suggest to expand this commit text so it doesn't > > > read > > > too self-repeating? > > > > My issue with this particular commit message is mainly that it > > doesn't > > make it obvious what the patch is supposed to do. So one can make > > it > > quicker and better to review by detailing it a bit more, so the > > reviewer then can compare commit message vs. what the code does. It > > seems to me for example that the actual optimization is being done > > in > > drm_sched_entity_push_job(), and drm_sched_entity_pop_job() had to > > be > > ported, too, for correctness > > "It seems" aka the commit title says so. ;) > > > Another small thing that might be cool is something that makes it a > > bit > > more obvious that this is an optimization, not a fix. > > > > So I would probably write: > > > > "So far, drm_sched_rq_update_fifo() automatically takes > > drm_sched_entity.rq_lock. For DRM_SCHED_POLICY_FIFO, this is > > inefficient because that lock is then taken, released and retaken > > in > > drm_sched_entity_push_job(). > > > > Improve performance by moving the locking out of > > drm_sched_rq_update_fifo()." > > > > Not that much longer but makes it far more obvious what shall be > > achieved where :] > > How about this: > > """ > In FIFO mode (which is the default), both drm_sched_entity_push_job() > and drm_sched_rq_update_fifo(), where the latter calls the former, > are > currently taking and releasing the same entity->rq_lock. > > We can avoid that design inelegance, and also have a miniscule > efficiency improvement on the idle submit path, by introducing a new > drm_sched_rq_update_fifo_locked() helper and pulling up the lock > taking > to its callers. > """ That looks good to me P. > > > (Let me read through the other patches briefly. Then we should be > > good > > with v2 of this series.. or would it be v3 then?) > > Depends how you count. By unique series titles or by fundamental > content. > > Regards, > > Tvrtko >