On 2020-03-31 08:46, Nirmoy wrote:
>
> On 3/31/20 3:01 AM, Luben Tuikov wrote:
>> This patch seems to be using DOS line-endings.
>
> Strange, I don't see that in my local patch file.
>

Not sure why "git am" complained about DOS endings the first time
I downloaded it. The second time it was fine.

[snip]

>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> index 29f0a410091b..27abbdc603dd 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
>>> @@ -721,6 +721,11 @@ struct amd_powerplay {
>>>  	const struct amd_pm_funcs *pp_funcs;
>>>  };
>>>
>>> +struct amdgpu_sched {
>>> +	uint32_t			num_scheds;
>>> +	struct drm_gpu_scheduler	*sched[HWIP_MAX_INSTANCE];
>>> +};
>>> +
>>>  #define AMDGPU_RESET_MAGIC_NUM 64
>>>  #define AMDGPU_MAX_DF_PERFMONS 4
>>>  struct amdgpu_device {
>>> @@ -858,6 +863,8 @@ struct amdgpu_device {
>>>  	struct amdgpu_ring		*rings[AMDGPU_MAX_RINGS];
>>>  	bool				ib_pool_ready;
>>>  	struct amdgpu_sa_manager	ring_tmp_bo[AMDGPU_IB_POOL_MAX];
>>> +	/* drm scheduler list */
>>> +	struct amdgpu_sched		gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX];
>>
>> That's a 2-dimensional array of "struct amdgpu_sched".
>> I think that the comment should be removed, or at least
>> not say "drm scheduler list". (I can see the structure
>> definition above.)
>
> Yes, I should remove it.
>
>> If this is the path you want to go, consider removing
>> "num_scheds" and creating a three-dimensional array,
>> which would really essentialize the direction you want
>> to go:
>>
>> struct drm_gpu_scheduler *gpu_sched[AMDGPU_HW_IP_NUM][AMDGPU_RING_PRIO_MAX][HWIP_MAX_INSTANCE];
>>
>> Now that this architecture is stripped down to its essentials,
>> perhaps we can see some optimizations...?
>
> If you mean whether we should see any performance improvement, then IMO
> we may not see much difference, as we are using pretty much the same
> number of memory accesses, plus we now have the extra cost of
> array_index_nospec(). Also, this is not a hot code path: we do only one
> amdgpu_ctx_init_entity() per HW_IP per context.

No, this has nothing to do with "performance". It's all about architecture
and design. You seem to have array-array-struct, where the struct holds an
array and an int, and it seems much cleaner to just have array-array-array.
I don't think you need to break the chain with a struct of an int and an
array, just as I described in my comment below, which you snipped without
addressing. If you're not going to address a comment, don't delete it;
leave it for others to see that it hasn't been addressed. Only snip
previously addressed and resolved comments and previously seen patch
info, like the diffstat, etc.

>> Also consider that since you're creating an array of pointers,
>> you don't necessarily need to know their count. Your hot-path
>> algorithms should not need to know it. If you need to print
>> their count, say in sysfs, then you can count them up on
>> behalf of the user-space process cat-ing the sysfs entry.
>>
>> [snip]
>>
>>> +
>>> +	/* set to default prio if sched_list is NULL */
>>> +	if (!adev->gpu_sched[hw_ip][hw_prio].num_scheds)
>>> +		hw_prio = AMDGPU_RING_PRIO_DEFAULT;
>>
>> That comment is a bit confusing--it talks about a list
>> not being NULL, while the conditional implicitly checks
>> against 0.
>
> Yes, this is wrong, will remove it.
>
> <snip>

I wish you hadn't snipped my comment here; please address it one way
or the other.
It is: I'd much rather that an integer comparison be performed against
integers, as opposed to using the logical NOT operator (which is implicit
in comparing against 0), i.e.

	if (adev->gpu_sched[hw_ip][hw_prio].num_scheds == 0)

Also, architecturally there seems to be informational redundancy in
keeping an integer count and a list of objects at the same time, as the
above if-conditional exposes: the comment talks about a list and NULL,
but the conditional implicitly checks against 0. Perhaps we don't need
"num_scheds" at all, and you can just check whether the entry is NULL
and assign PRIO_DEFAULT.

>> @@ -258,6 +272,12 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
>>  	ring->priority = DRM_SCHED_PRIORITY_NORMAL;
>>  	mutex_init(&ring->priority_mutex);
>>
>> +	if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
>> +		hw_ip = amdgpu_ring_type_to_drm_hw_ip[ring->funcs->type];
>> +		num_sched = &adev->gpu_sched[hw_ip][hw_prio].num_scheds;
>> +		adev->gpu_sched[hw_ip][hw_prio].sched[(*num_sched)++] = &ring->sched;
>> +	}
>>
>> This seems unnecessarily complicated. Perhaps we can remove
>> "num_scheds", as recommended above, and keep a running pointer
>> while initialization is being done...?
>
> What do you mean by a running pointer?

A "running pointer" is a local pointer you use temporarily to traverse
memory. If you remove "num_scheds", as noted in my previous comment,
then you can use a running pointer to find the next entry you'd
initialize.

Regards,
Luben
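P.S. A rough, untested sketch of the direction I mean; the names and
placement are only illustrative, and it assumes adev (and so the
gpu_sched array) is zero-initialized:

	/* In struct amdgpu_device: a plain 3-D array of scheduler
	 * pointers, no count.
	 */
	struct drm_gpu_scheduler *gpu_sched[AMDGPU_HW_IP_NUM]
					   [AMDGPU_RING_PRIO_MAX]
					   [HWIP_MAX_INSTANCE];

	/* Where hw_prio falls back to the default: a NULL check on
	 * the first slot, instead of testing a count.
	 */
	if (adev->gpu_sched[hw_ip][hw_prio][0] == NULL)
		hw_prio = AMDGPU_RING_PRIO_DEFAULT;

	/* In amdgpu_ring_init(): walk a running pointer to the first
	 * free (NULL) slot and drop the ring's scheduler there. A real
	 * version would also guard against walking past
	 * HWIP_MAX_INSTANCE.
	 */
	if (ring->funcs->type != AMDGPU_RING_TYPE_KIQ) {
		struct drm_gpu_scheduler **p;

		hw_ip = amdgpu_ring_type_to_drm_hw_ip[ring->funcs->type];
		p = adev->gpu_sched[hw_ip][hw_prio];
		while (*p)
			p++;
		*p = &ring->sched;
	}

If re-walking the array bothers you, a per-IP/per-priority cursor kept
only for the duration of init would do the same job without the scan.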