[PATCH 14/25] drm/amdkfd: Populate DRM render device minor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2018-02-13 05:46 AM, Christian König wrote:
> Am 12.02.2018 um 17:57 schrieb Felix Kuehling:
>>> [SNIP]
>>> I see that as an advantage rather than a problem, cause it fixes a
>>> couple of problems with the KFD where the address space of the inode
>>> is not managed correctly as far as I can see.
>> I don't think GEM is involved in the management of address space. That's
>> handled inside TTM and drm_vma_manager, both of which we are using. We
>> have tested CPU mappings with evictions and this is working correctly
>> now.
>
> Ok, correct. I was thinking about that as one large functionality.
> When you see it like this GEM basically only implements looking up BOs
> and reference counting them, but both are still useful features.
>
>> We had problems with this before, when we were CPU-mapping our buffers
>> through the KFD device FD.
>>
>>> That implementation issues never caused problems right now because you
>>> never tried to unmap doorbells. But with the new eviction code that is
>>> about to change, isn't it?
>> I don't understand this comment. What does this have to do with
>> doorbells?
>
> I was assuming that doorbells are now unmapped to figure out when to
> start a process again.

Restart is done based on a timer.

>
> See what I'm missing is how does userspace figures out which address
> to use to map the doorbell? E.g. I don't see a call to
> drm_vma_offset_manager_init() or something like that.

Each process gets a whole page of the doorbell aperture assigned to it.
The assumption is that amdgpu only uses the first page of the doorbell
aperture, so KFD uses all the rest. On GFX8 and before, the queue ID is
used as the offset into the doorbell page. On GFX9 the hardware does
some engine-specific doorbell routing, so we added another layer of
doorbell management that's decoupled from the queue ID.

Either way, an entire doorbell page gets mapped into user mode and user
mode knows the offset of the doorbells for specific queues. The mapping
is currently handled by kfd_mmap in kfd_chardev.c. A later patch will
add the ability to map doorbells into GPUVM address space so that GPUs
can dispatch work to themselves (or each other) by creating a special
doorbell BO that can be mapped both to the GPU and the CPU. It works by
creating an amdgpu_bo with an SG describing the doorbell page of the
process.

>
>> Doorbells are never unmapped. When queues are evicted doorbells remain
>> mapped to the process. User mode can continue writing to doorbells,
>> though they won't have any immediate effect while the corresponding
>> queues are unmapped from the HQDs.
>
> Do you simply assume that after evicting a process it always needs to
> be restarted without checking if it actually does something? Or how
> does that work?

Exactly. With later addition of GPU self-dispatch a page-fault based
mechanism wouldn't work any more. We have to restart the queues blindly
with a timer. See evict_process_worker, which schedules the restore with
a delayed worker.

The user mode queue ABI specifies that user mode update both the
doorbell and a WPTR in memory. When we restart queues we (or the CP
firmware) use the WPTR to make sure we catch up with any work that was
submitted while the queues were unmapped.

Regards,
  Felix

>
> Regards,
> Christian.



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux