On 06.08.2018 02:13, Dieter Nützel wrote:
> On 04.08.2018 06:18, Dieter Nützel wrote:
>> On 04.08.2018 06:12, Dieter Nützel wrote:
>>> On 04.08.2018 05:27, Dieter Nützel wrote:
>>>> On 03.08.2018 13:09, Christian König wrote:
>>>>> On 03.08.2018 at 03:08, Dieter Nützel wrote:
>>>>>> Hello Christian, AMD guys,
>>>>>>
>>>>>> this one _together_ with this series
>>>>>> [PATCH 1/7] drm/amdgpu: use new scheduler load balancing for VMs
>>>>>> https://lists.freedesktop.org/archives/amd-gfx/2018-August/024802.html
>>>>>>
>>>>>> on top of
>>>>>> amd-staging-drm-next 53d5f1e4a6d9
>>>>>>
>>>>>> freezes the whole system (Intel Xeon X3470, RX580) during the
>>>>>> _first_ mouse move.
>>>>>> Same for sddm login or first move in KDE Plasma 5.
>>>>>> NO logs so far. - Expected?
>>>>>
>>>>> Not even remotely, can you double check which patch from the
>>>>> "[PATCH 1/7] drm/amdgpu: use new scheduler load balancing for VMs"
>>>>> series is causing the issue?
>>>>
>>>> Oops,
>>>>
>>>> _both_ 'series' on top of
>>>>
>>>> bf1fd52b0632 (origin/amd-staging-drm-next) drm/amdgpu: move gem
>>>> definitions into amdgpu_gem header
>>>>
>>>> work without a hitch.
>>>>
>>>> But I have new (latest) µcode from openSUSE Tumbleweed.
>>>> kernel-firmware-20180730-35.1.src.rpm
>>>>
>>>> Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
>>>
>>> I take this back.
>>>
>>> It just lasted much longer.
>>> Mouse freeze.
>>> Could grab a dmesg via remote phone ;-)
>>>
>>> See the attachment.
>>> Dieter
>>
>> Argh, shi...
>> wrong dmesg version.
>>
>> Should be this one. (For sure...)
>
> Phew,
>
> this took some time...
> During the 'last' git bisect run => 'first bad commit is' I got the
> next freeze.
> But I could get a new dmesg.log file via remote phone (see attachment).
>
> git bisect log shows this:
>
> SOURCE/amd-staging-drm-next> git bisect log
> git bisect start
> # good: [adebfff9c806afe1143d69a0174d4580cd27b23d] drm/scheduler: fix setting the priorty for entities
> git bisect good adebfff9c806afe1143d69a0174d4580cd27b23d
> # bad: [43202e67a4e6fcb0e6b773e8eb1ed56e1721e882] drm/amdgpu: use entity instead of ring for CS
> git bisect bad 43202e67a4e6fcb0e6b773e8eb1ed56e1721e882
> # bad: [9867b3a6ddfb73ee3105871541053f8e49949478] drm/amdgpu: use scheduler load balancing for compute CS
> git bisect bad 9867b3a6ddfb73ee3105871541053f8e49949478
> # good: [5d097a4591aa2be16b21adbaa19a8abb76e47ea1] drm/amdgpu: use scheduler load balancing for SDMA CS
> git bisect good 5d097a4591aa2be16b21adbaa19a8abb76e47ea1
> # first bad commit: [9867b3a6ddfb73ee3105871541053f8e49949478] drm/amdgpu: use scheduler load balancing for compute CS
>
> git log --oneline
> 5d097a4591aa (HEAD, refs/bisect/good-5d097a4591aa2be16b21adbaa19a8abb76e47ea1) drm/amdgpu: use scheduler load balancing for SDMA CS
> d12ae5172f1f drm/amdgpu: use new scheduler load balancing for VMs
> adebfff9c806 (refs/bisect/good-adebfff9c806afe1143d69a0174d4580cd27b23d) drm/scheduler: fix setting the priorty for entities
> bf1fd52b0632 (origin/amd-staging-drm-next) drm/amdgpu: move gem definitions into amdgpu_gem header
> 5031ae5f9e5c drm/amdgpu: move psp macro into amdgpu_psp header
> [-]
>
> I'm not really sure that
> drm/amdgpu: use scheduler load balancing for compute CS
> is the offender.
>
> It could be one step earlier, too:
> drm/amdgpu: use scheduler load balancing for SDMA CS
>
> I'll try running with the SDMA CS patch for the next few days.
>
> If you need more, ask!

Hello Christian,

running for the second day _without_ the 2nd patch
[2/7] drm/amdgpu: use scheduler load balancing for SDMA CS
my system is stable again.

To be clear:
I now have only #1 applied on top of amd-staging-drm-next.
'This one' is still in.
So we should switch threads.

Dieter

>>>>
>>>>> Thanks,
>>>>> Christian.
>>>>>
>>>>>>
>>>>>> Greetings,
>>>>>> Dieter
>>>>>>
>>>>>> On 01.08.2018 16:27, Christian König wrote:
>>>>>>> Since we now deal with multiple rq we need to update all of them,
>>>>>>> not just the current one.
>>>>>>>
>>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>>> ---
>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c   |  3 +--
>>>>>>>  drivers/gpu/drm/scheduler/gpu_scheduler.c | 36 ++++++++++++++++++++-----------
>>>>>>>  include/drm/gpu_scheduler.h               |  5 ++---
>>>>>>>  3 files changed, 26 insertions(+), 18 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>>>>> index df6965761046..9fcc14e2dfcf 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
>>>>>>> @@ -407,12 +407,11 @@ void amdgpu_ctx_priority_override(struct amdgpu_ctx *ctx,
>>>>>>>  	for (i = 0; i < adev->num_rings; i++) {
>>>>>>>  		ring = adev->rings[i];
>>>>>>>  		entity = &ctx->rings[i].entity;
>>>>>>> -		rq = &ring->sched.sched_rq[ctx_prio];
>>>>>>>
>>>>>>>  		if (ring->funcs->type == AMDGPU_RING_TYPE_KIQ)
>>>>>>>  			continue;
>>>>>>>
>>>>>>> -		drm_sched_entity_set_rq(entity, rq);
>>>>>>> +		drm_sched_entity_set_priority(entity, ctx_prio);
>>>>>>>  	}
>>>>>>>  }
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> index 05dc6ecd4003..85908c7f913e 100644
>>>>>>> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
>>>>>>> @@ -419,29 +419,39 @@ static void drm_sched_entity_clear_dep(struct dma_fence *f, struct dma_fence_cb
>>>>>>>  }
>>>>>>>
>>>>>>>  /**
>>>>>>> - * drm_sched_entity_set_rq - Sets the run queue for an entity
>>>>>>> + * drm_sched_entity_set_rq_priority - helper for drm_sched_entity_set_priority
>>>>>>> + */
>>>>>>> +static void drm_sched_entity_set_rq_priority(struct drm_sched_rq **rq,
>>>>>>> +					     enum drm_sched_priority priority)
>>>>>>> +{
>>>>>>> +	*rq = &(*rq)->sched->sched_rq[priority];
>>>>>>> +}
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_sched_entity_set_priority - Sets priority of the entity
>>>>>>>   *
>>>>>>>   * @entity: scheduler entity
>>>>>>> - * @rq: scheduler run queue
>>>>>>> + * @priority: scheduler priority
>>>>>>>   *
>>>>>>> - * Sets the run queue for an entity and removes the entity from the previous
>>>>>>> - * run queue in which was present.
>>>>>>> + * Update the priority of runqueus used for the entity.
>>>>>>>   */
>>>>>>> -void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>>>>>>> -			     struct drm_sched_rq *rq)
>>>>>>> +void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>>>> +				   enum drm_sched_priority priority)
>>>>>>>  {
>>>>>>> -	if (entity->rq == rq)
>>>>>>> -		return;
>>>>>>> -
>>>>>>> -	BUG_ON(!rq);
>>>>>>> +	unsigned int i;
>>>>>>>
>>>>>>>  	spin_lock(&entity->rq_lock);
>>>>>>> +
>>>>>>> +	for (i = 0; i < entity->num_rq_list; ++i)
>>>>>>> +		drm_sched_entity_set_rq_priority(&entity->rq_list[i], priority);
>>>>>>> +
>>>>>>>  	drm_sched_rq_remove_entity(entity->rq, entity);
>>>>>>> -	entity->rq = rq;
>>>>>>> -	drm_sched_rq_add_entity(rq, entity);
>>>>>>> +	drm_sched_entity_set_rq_priority(&entity->rq, priority);
>>>>>>> +	drm_sched_rq_add_entity(entity->rq, entity);
>>>>>>> +
>>>>>>>  	spin_unlock(&entity->rq_lock);
>>>>>>>  }
>>>>>>> -EXPORT_SYMBOL(drm_sched_entity_set_rq);
>>>>>>> +EXPORT_SYMBOL(drm_sched_entity_set_priority);
>>>>>>>
>>>>>>>  /**
>>>>>>>   * drm_sched_dependency_optimized
>>>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>>>> index 0c4cfe689d4c..22c0f88f7d8f 100644
>>>>>>> --- a/include/drm/gpu_scheduler.h
>>>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>>>> @@ -298,9 +298,8 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity);
>>>>>>>  void drm_sched_entity_destroy(struct drm_sched_entity *entity);
>>>>>>>  void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
>>>>>>>  			       struct drm_sched_entity *entity);
>>>>>>> -void drm_sched_entity_set_rq(struct drm_sched_entity *entity,
>>>>>>> -			     struct drm_sched_rq *rq);
>>>>>>> -
>>>>>>> +void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>>>>> +				   enum drm_sched_priority priority);
>>>>>>>  struct drm_sched_fence *drm_sched_fence_create(
>>>>>>>  	struct drm_sched_entity *s_entity, void *owner);
>>>>>>>  void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
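
For anyone following the quoted patch: its core idea is that every run queue pointer held by an entity is re-pointed to the slot for the new priority inside the same scheduler, rather than only the current one. Below is a rough user-space sketch of just that pointer update. All model_* names are hypothetical stand-ins and not the kernel types; the rq_lock handling and the removal and re-adding of the entity on its run queue are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical priority levels; the kernel's enum drm_sched_priority differs. */
enum model_priority {
	MODEL_PRIO_LOW,
	MODEL_PRIO_NORMAL,
	MODEL_PRIO_HIGH,
	MODEL_PRIO_MAX
};

struct model_sched;

/* A run queue only needs a back-pointer to its scheduler for this sketch. */
struct model_rq {
	struct model_sched *sched;
};

/* One run queue per priority level, as sched_rq[] is indexed in the patch. */
struct model_sched {
	struct model_rq sched_rq[MODEL_PRIO_MAX];
};

/* An entity holds its current run queue plus one candidate per scheduler. */
struct model_entity {
	struct model_rq *rq;
	struct model_rq **rq_list;
	unsigned int num_rq_list;
};

static void model_sched_init(struct model_sched *sched)
{
	for (int p = 0; p < MODEL_PRIO_MAX; ++p)
		sched->sched_rq[p].sched = sched;
}

/* Re-point *rq to the slot for the given priority in the same scheduler. */
static void model_set_rq_priority(struct model_rq **rq,
				  enum model_priority priority)
{
	assert(priority < MODEL_PRIO_MAX);
	*rq = &(*rq)->sched->sched_rq[priority];
}

/* Update every candidate run queue first, then the current one. */
static void model_entity_set_priority(struct model_entity *entity,
				      enum model_priority priority)
{
	for (unsigned int i = 0; i < entity->num_rq_list; ++i)
		model_set_rq_priority(&entity->rq_list[i], priority);

	model_set_rq_priority(&entity->rq, priority);
}
```

With two schedulers and an entity whose candidates both start at MODEL_PRIO_NORMAL, calling model_entity_set_priority(&entity, MODEL_PRIO_HIGH) should leave each candidate pointing at the high-priority slot of its own scheduler. This only models the bookkeeping and says nothing about the freeze reported above.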