Hi, this is your Linux kernel regression tracker speaking. CCing Dave and Daniel. On 15.12.21 23:32, Ben Skeggs wrote: > On Tue, 14 Dec 2021 at 19:19, Christian König > <ckoenig.leichtzumerken@xxxxxxxxx> wrote: >> >> Am 11.12.21 um 10:59 schrieb Stefan Fritsch: >>> On 09.12.21 11:23, Christian König wrote: >>>> Always waiting for the exclusive fence resulted on some performance >>>> regressions. So try to wait for the shared fences first, then the >>>> exclusive fence should always be signaled already. >>>> >>>> v2: fix incorrectly placed "(", add some comment why we do this. >>>> >>>> Signed-off-by: Christian König <christian.koenig@xxxxxxx> >>> >>> Tested-by: Stefan Fritsch <sf@xxxxxxxxxxx> >> >> Thanks. >> >>> >>> Please also add a cc for linux-stable, so that this is fixed in 5.15.x >> >> Sure, but I still need some acked-by or rb from one of the Nouveau guys. >> So gentle ping on that. > Acked-by: Ben Skeggs <bskeggs@xxxxxxxxxx> What's the status of this patch? I checked a few git trees, but either it's not there or it missed it. Reminder, it's a regression already introduced in v5.15, hence all users of the current stable kernel are affected by it, so it would be nice to get the fix on its way now that Ben acked it and Dan tested it. Ciao, Thorsten P.S.: As a Linux kernel regression tracker I'm getting a lot of reports on my table. I can only look briefly into most of them. Unfortunately therefore I sometimes will get things wrong or miss something important. I hope that's not the case here; if you think it is, don't hesitate to tell me about it in a public reply. That's in everyone's interest, as what I wrote above might be misleading to everyone reading this; any suggestion I gave thus might sent someone reading this down the wrong rabbit hole, which none of us wants. BTW, I have no personal interest in this issue, which is tracked using regzbot, my Linux kernel regression tracking bot (https://linux-regtracking.leemhuis.info/regzbot/). I'm only posting this mail to get things rolling again and hence don't need to be CC on all further activities wrt to this regression. #regzbot poke >>>> --- >>>> drivers/gpu/drm/nouveau/nouveau_fence.c | 28 +++++++++++++------------ >>>> 1 file changed, 15 insertions(+), 13 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c >>>> b/drivers/gpu/drm/nouveau/nouveau_fence.c >>>> index 05d0b3eb3690..0ae416aa76dc 100644 >>>> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c >>>> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c >>>> @@ -353,15 +353,22 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, >>>> struct nouveau_channel *chan, bool e >>>> if (ret) >>>> return ret; >>>> - } >>>> - fobj = dma_resv_shared_list(resv); >>>> - fence = dma_resv_excl_fence(resv); >>>> + fobj = NULL; >>>> + } else { >>>> + fobj = dma_resv_shared_list(resv); >>>> + } >>>> - if (fence) { >>>> + /* Waiting for the exclusive fence first causes performance >>>> regressions >>>> + * under some circumstances. So manually wait for the shared >>>> ones first. >>>> + */ >>>> + for (i = 0; i < (fobj ? fobj->shared_count : 0) && !ret; ++i) { >>>> struct nouveau_channel *prev = NULL; >>>> bool must_wait = true; >>>> + fence = rcu_dereference_protected(fobj->shared[i], >>>> + dma_resv_held(resv)); >>>> + >>>> f = nouveau_local_fence(fence, chan->drm); >>>> if (f) { >>>> rcu_read_lock(); >>>> @@ -373,20 +380,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, >>>> struct nouveau_channel *chan, bool e >>>> if (must_wait) >>>> ret = dma_fence_wait(fence, intr); >>>> - >>>> - return ret; >>>> } >>>> - if (!exclusive || !fobj) >>>> - return ret; >>>> - >>>> - for (i = 0; i < fobj->shared_count && !ret; ++i) { >>>> + fence = dma_resv_excl_fence(resv); >>>> + if (fence) { >>>> struct nouveau_channel *prev = NULL; >>>> bool must_wait = true; >>>> - fence = rcu_dereference_protected(fobj->shared[i], >>>> - dma_resv_held(resv)); >>>> - >>>> f = nouveau_local_fence(fence, chan->drm); >>>> if (f) { >>>> rcu_read_lock(); >>>> @@ -398,6 +398,8 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, >>>> struct nouveau_channel *chan, bool e >>>> if (must_wait) >>>> ret = dma_fence_wait(fence, intr); >>>> + >>>> + return ret; >>>> } >>>> return ret; >>