Re: [PATCH 2/3] drm/amdgpu: Pop jobs from the queue more robustly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 14.02.25 um 11:34 schrieb Tvrtko Ursulin:
>
> On 14/02/2025 10:31, Christian König wrote:
>> Am 14.02.25 um 11:21 schrieb Tvrtko Ursulin:
>>>
>>> Hi Christian,
>>>
>>> On 11/02/2025 10:21, Christian König wrote:
>>>> Am 11.02.25 um 11:08 schrieb Philipp Stanner:
>>>>> On Tue, 2025-02-11 at 09:22 +0100, Christian König wrote:
>>>>>> Am 06.02.25 um 17:40 schrieb Tvrtko Ursulin:
>>>>>>> Replace a copy of DRM scheduler's to_drm_sched_job with a copy of a
>>>>>>> newly
>>>>>>> added __drm_sched_entity_queue_pop.
>>>>>>>
>>>>>>> This allows breaking the hidden dependency that queue_node has to
>>>>>>> be the
>>>>>>> first element in struct drm_sched_job.
>>>>>>>
>>>>>>> A comment is also added with a reference to the mailing list
>>>>>>> discussion
>>>>>>> explaining the copied helper will be removed when the whole broken
>>>>>>> amdgpu_job_stop_all_jobs_on_sched is removed.
>>>>>>>
>>>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxx>
>>>>>>> Cc: Christian König <christian.koenig@xxxxxxx>
>>>>>>> Cc: Danilo Krummrich <dakr@xxxxxxxxxx>
>>>>>>> Cc: Matthew Brost <matthew.brost@xxxxxxxxx>
>>>>>>> Cc: Philipp Stanner <phasta@xxxxxxxxxx>
>>>>>>> Cc: "Zhang, Hawking" <Hawking.Zhang@xxxxxxx>
>>>>>> Reviewed-by: Christian König <christian.koenig@xxxxxxx>
>>>>> I think this v3 has been supplanted by a v4 by now.
>>>>
>>>> I've seen the larger v4 series as well, but at least that patch here looks identical on first glance. So my rb still counts.
>>>
>>> Is it okay for you to merge the whole series (including this single amdgpu patch) via drm-misc?
>>
>> I can do that, but don't want the scheduler maintainer want to pick them up?
>
> Sorry that was some bad and unclear English. :(

Don't worry, I'm not a native speaker either and had only very minimal formal education on it :)

>
> It is as you suggest - what I meant was, is it okay from your point of view that the whole series is merged via drm-misc? I assume Philipp would indeed be the one to merge it, once all patches get r-b-ed.

Ah! Yes of course it. Feel free to go ahead.

Could only be that Alex runs into merge issues, but that is extremely unlikely I think.

Regards,
Christian.

>
> Regards,
>
> Tvrtko
>
>>>>> @Tvrtko: btw, do you create patches with
>>>>> git format-patch -v4 ?
>>>>>
>>>>> That way the v4 label will be included in all patch titles, too, not
>>>>> just the cover letter. That makes searching etc. easier in large
>>>>> inboxes
>>>>>
>>>>> P.
>>>>>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 22 +++++++++++++++++++-
>>>>>>> -- 
>>>>>>>     1 file changed, 19 insertions(+), 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>> index 100f04475943..22cb48bab24d 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>>>>>> @@ -411,8 +411,24 @@ static struct dma_fence *amdgpu_job_run(struct
>>>>>>> drm_sched_job *sched_job)
>>>>>>>         return fence;
>>>>>>>     }
>>>>>>> -#define to_drm_sched_job(sched_job)        \
>>>>>>> -        container_of((sched_job), struct drm_sched_job,
>>>>>>> queue_node)
>>>>>>> +/*
>>>>>>> + * This is a duplicate function from DRM scheduler
>>>>>>> sched_internal.h.
>>>>>>> + * Plan is to remove it when amdgpu_job_stop_all_jobs_on_sched is
>>>>>>> removed, due
>>>>>>> + * latter being incorrect and racy.
>>>>>>> + *
>>>>>>> + * See
>>>>>>> https://lore.kernel.org/amd-gfx/44edde63-7181-44fb- a4f7-94e50514f539@xxxxxxx/
>>>>>>> + */
>>>>>>> +static struct drm_sched_job *
>>>>>>> +__drm_sched_entity_queue_pop(struct drm_sched_entity *entity)
>>>>>>> +{
>>>>>>> +    struct spsc_node *node;
>>>>>>> +
>>>>>>> +    node = spsc_queue_pop(&entity->job_queue);
>>>>>>> +    if (!node)
>>>>>>> +        return NULL;
>>>>>>> +
>>>>>>> +    return container_of(node, struct drm_sched_job,
>>>>>>> queue_node);
>>>>>>> +}
>>>>>>>     void amdgpu_job_stop_all_jobs_on_sched(struct drm_gpu_scheduler
>>>>>>> *sched)
>>>>>>>     {
>>>>>>> @@ -425,7 +441,7 @@ void amdgpu_job_stop_all_jobs_on_sched(struct
>>>>>>> drm_gpu_scheduler *sched)
>>>>>>>             struct drm_sched_rq *rq = sched->sched_rq[i];
>>>>>>>             spin_lock(&rq->lock);
>>>>>>>             list_for_each_entry(s_entity, &rq->entities, list)
>>>>>>> {
>>>>>>> -            while ((s_job =
>>>>>>> to_drm_sched_job(spsc_queue_pop(&s_entity->job_queue)))) {
>>>>>>> +            while ((s_job =
>>>>>>> __drm_sched_entity_queue_pop(s_entity))) {
>>>>>>>                     struct drm_sched_fence *s_fence =
>>>>>>> s_job->s_fence;
>>>>>>>                     dma_fence_signal(&s_fence-
>>>>>>>> scheduled);
>>>>
>>>
>>
>




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux