Re: Why does kgd2kfd_interrupt() have to schedule work on a specific CPU?

This was to fix long application event-wait latency: when the app's shaders generate lots of event interrupts in a short period, work scheduled on the same CPU core that is handling the interrupts gets no chance to execute, so events cannot be posted back to the app threads waiting on them. Scheduling the work on another core of the same NUMA node is a general cache-usage optimization.

Regards

Philip

On 2023-06-27 11:42, Alex Deucher wrote:
+Felix, Philip

On Tue, Jun 27, 2023 at 4:42 AM Philipp Stanner <pstanner@xxxxxxxxxx> wrote:
Hello folks,

I'm currently trying to learn more about DRM and discovered the
following code sequence:


drivers/gpu/drm/amd/amdkfd/kfd_device.c, Line 824 on 6.4-rc7

static inline void kfd_queue_work(struct workqueue_struct *wq,
                                  struct work_struct *work)
{
        int cpu, new_cpu;

        cpu = new_cpu = smp_processor_id();
        do {
                new_cpu = cpumask_next(new_cpu, cpu_online_mask) % nr_cpu_ids;
                if (cpu_to_node(new_cpu) == numa_node_id())
                        break;
        } while (cpu != new_cpu);

        queue_work_on(new_cpu, wq, work);
}

/* This is called directly from KGD at ISR. */
void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
{
        uint32_t patched_ihre[KFD_MAX_RING_ENTRY_SIZE];
        bool is_patched = false;
        unsigned long flags;

        if (!kfd->init_complete)
                return;

        if (kfd->device_info.ih_ring_entry_size > sizeof(patched_ihre)) {
                dev_err_once(kfd_device, "Ring entry too small\n");
                return;
        }

        spin_lock_irqsave(&kfd->interrupt_lock, flags);

        if (kfd->interrupts_active
            && interrupt_is_wanted(kfd, ih_ring_entry,
                                   patched_ihre, &is_patched)
            && enqueue_ih_ring_entry(kfd,
                                     is_patched ? patched_ihre : ih_ring_entry))
                kfd_queue_work(kfd->ih_wq, &kfd->interrupt_work);

        spin_unlock_irqrestore(&kfd->interrupt_lock, flags);
}


These functions seem to be invoked exclusively by amdgpu_irq_dispatch()
in amdgpu_irq.c.
At first glance it seems to me that a typical scenario is taking place
here: an interrupt arises, the handler submits work to a workqueue and
then jumps back to sleep / the former process execution context.

What I don't understand is why it's apparently important to schedule
the work on a particular CPU.

It seems that the do-while in kfd_queue_work() is searching for a CPU
within the same NUMA-Node. Thus I suspect that this is done because
either
a) performance requires it or
b) the work-function needs access to something that's only available
   within the same node.

I suspect there is an interrupt-related reason why that particular work
should be enqueued on a specific CPU. Just by reading the code alone I
can't really figure out why precisely that's necessary, though.

Does someone have any hints for me? :)

Cheers,
Philipp


