On 02.12.19 at 15:43, Nirmoy wrote:
On 11/29/19 7:42 PM, Christian König wrote:
On 29.11.19 at 15:29, Nirmoy wrote:
Hi Christian,
On 11/26/19 10:45 AM, Christian König wrote:
It looks like a start, but there are numerous things which need to be
fixed.
Question number one is: What's that good for? Entities are not the
problem here. The real issue is the fence ring and the rq_list.
The rq_list could actually be made constant since it should never
be changed by the entity. It is only changed for backward
compatibility in drm_sched_entity_set_priority().
So I would start there and clean up drm_sched_entity_set_priority()
to actually just set a new constant rq list instead.
I am missing some context here. Can you please explain a bit more? I
looked it over and over again, but I still don't understand what you
mean by a new constant rq list :/
Ok, that needs a bit wider explanation.
The GPU scheduler consists mainly of drm_gpu_scheduler instances.
Each of those instances contains multiple runqueues with different
priorities (5 IIRC).
Now for each entity we give a list of runqueues this entity can be
served on, i.e. where the jobs which are pushed to the entity are
executed.
The entity itself keeps a copy of that runqueue list because we have
drm_sched_entity_set_priority(), which modifies this runqueue list.
But essentially that is complete overkill; the runqueue lists are
constant for each amdgpu device, e.g. all contexts should use SDMA0
and SDMA1 in the same way.
In other words, building the list of runqueues should happen only once
and not for each context.
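To make that concrete, here is a simplified sketch of the structures
involved (just a sketch, not the literal definitions from
include/drm/gpu_scheduler.h):

/*
 * Simplified sketch, not the literal kernel definitions: each
 * drm_gpu_scheduler instance owns one runqueue per priority level,
 * while every entity currently carries its own copy of the runqueue
 * list it may be scheduled on, rebuilt for every context.
 */
struct drm_gpu_scheduler {
	struct drm_sched_rq	sched_rq[DRM_SCHED_PRIORITY_MAX];
	/* ... */
};

struct drm_sched_entity {
	struct drm_sched_rq	**rq_list;	/* per-entity copy */
	unsigned int		num_rq_list;
	/* ... */
};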
Okay, now I understand the real problem. Thanks for the detailed
explanation.
Multiple approaches to fix this would be possible. One rather elegant
solution would be to change the rq list into a list of scheduler
instances plus a priority.
Do you mean something like:
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 684692a8ed76..ac67f8f098fa 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -81,7 +81,7 @@ enum drm_sched_priority {
 struct drm_sched_entity {
 	struct list_head		list;
 	struct drm_sched_rq		*rq;
-	struct drm_sched_rq		**rq_list;
+	struct drm_gpu_scheduler	**sched;
 	unsigned int			num_rq_list;
 	spinlock_t			rq_lock;
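With that change, picking the runqueue would, as far as I understand
it, become a simple lookup. A rough sketch of my understanding only,
assuming the entity also grows a priority member (not part of the
diff above) and using a hypothetical helper name:

static struct drm_sched_rq *
drm_sched_entity_pick_rq(struct drm_sched_entity *entity, unsigned int idx)
{
	/*
	 * Hypothetical helper; entity->priority is an assumed new
	 * member. The scheduler list itself stays constant.
	 */
	struct drm_gpu_scheduler *sched = entity->sched[idx];

	return &sched->sched_rq[entity->priority];
}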
Yes, exactly. The problem is that I'm not 100% sure whether that really
works with all users of the rq_list.
Regards,
Christian.
This way we would also fix the age-old bug that changing the priority
of a context could actually mess up already scheduled jobs.
The alternative I noted before would be to drop
drm_sched_entity_set_priority() or change it into
drm_sched_entity_set_runqueues().
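Just to illustrate the idea, a rough sketch of what I have in mind
(names and locking illustrative only, not a finished implementation):

void drm_sched_entity_set_runqueues(struct drm_sched_entity *entity,
				    struct drm_sched_rq **rq_list,
				    unsigned int num_rq_list)
{
	/*
	 * Swap in a different, constant rq list under the entity's
	 * lock instead of rewriting the priorities of each runqueue.
	 */
	spin_lock(&entity->rq_lock);
	entity->rq_list = rq_list;
	entity->num_rq_list = num_rq_list;
	spin_unlock(&entity->rq_lock);
}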
I was working on it, but then I got stuck on a "BUG: kernel NULL
pointer dereference, address:" which I am still trying to figure out.
Regards,
Christian.
Then we could embed the fences in amdgpu_ctx_entity as a dynamic
array at the end of the structure.
And last, we can start to dynamically allocate and initialize the
amdgpu_ctx_entity structures.
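Roughly like this (a sketch only, field names guessed and the fence
ring size assumed to be amdgpu_sched_jobs, not the actual amdgpu code):

struct amdgpu_ctx_entity {
	uint64_t		sequence;
	struct drm_sched_entity	entity;
	struct dma_fence	*fences[];	/* fence ring embedded at the end */
};

/* allocate one entity with room for its fence ring in a single allocation */
struct amdgpu_ctx_entity *ctx_entity =
	kzalloc(struct_size(ctx_entity, fences, amdgpu_sched_jobs), GFP_KERNEL);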
Regards,
Christian.
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx