[PATCH 14/25] drm/amdkfd: Populate DRM render device minor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 09.02.2018 um 21:31 schrieb Felix Kuehling:
> On 2018-02-09 07:34 AM, Christian König wrote:
>> I really wonder if sharing the GPUVM instance from a render node file
>> descriptor wouldn't be easier.
>>
>> You could just use the existing IOCTL for allocating and mapping
>> memory on a render node and then give the prepared fd to kfd to use.
> In amd-kfd-staging we have an ioctl to import a DMABuf into the KFD VM
> for interoperability between graphics and compute. I'm planning to
> upstream that later.

Could you move that around and upstream it first?

> It still depends on the KFD calls for managing the
> GPU mappings. The KFD GPU mapping calls need to interact with the
> eviction logic. TLB flushing involves the HIQ/KIQ for figuring out the
> VMID/PASID mapping, which is managed asynchronously by the HWS, not
> amdgpu_ids.c. User pointers need a different MMU notifier. AQL queue
> buffers on some GPUs need a double-mapping workaround for a HW
> wraparound bug. I could go on. So this is not easy to retrofit into
> amdgpu_gem and amdgpu_cs. Also, KFD VMs are created differently
> (AMDGPU_VM_CONTEXT_COMPUTE, amdkfd_vm wrapper structure).

Well, that is actually the reason why I'm asking about it.

First of all kernel development is not use case driver, e.g. we should 
not implement something because userspace needs it in a specific way, 
but rather because the hardware supports it in a specific way.

This means that in theory we should have the fixes for the HW problems 
in both interfaces. That doesn't make sense at the moment because we 
don't support user space queue through the render node file descriptor, 
but people are starting to ask for that as well.

> What's more, buffers allocated through amdgpu_gem calls create GEM
> objects that we don't need.

I see that as an advantage rather than a problem, cause it fixes a 
couple of problems with the KFD where the address space of the inode is 
not managed correctly as far as I can see.

That implementation issues never caused problems right now because you 
never tried to unmap doorbells. But with the new eviction code that is 
about to change, isn't it?

> And exporting and importing DMABufs adds
> more overhead and a potential to run into the process file-descriptor
> limit (maybe the FD could be discarded after importing).

Closing the file descriptor is a must have after importing, so that 
isn't an issue.

But I agree that exporting and reimporting the file descriptor adds some 
additional overhead.

> I honestly thought about whether this would be feasible when we
> implemented the CPU mapping through the DRM FD. But it would be nothing
> short of a complete redesign of the KFD memory management code. It would
> be months of work, more instability, for questionable benefits. I don't
> think it would be in the interest of end users and customers.

Yeah, agree on that.

> I just thought of a slightly different approach I would consider more
> realistic, without having thought through all the details: Adding
> KFD-specific memory management ioctls to the amdgpu device. Basically
> call amdgpu_amdkfd_gpuvm functions from amdgpu ioctl functions instead
> of KFD ioctl functions. But we'd still have KFD ioctls for other things,
> and the new amdgpu ioctls would be KFD-specific and not useful for
> graphics applications. It's probably still several weeks of work, but
> shouldn't have major teething issues because the internal logic and
> functionality would be basically unchanged. It would just move the
> ioctls from one device to another.

My thinking went into a similar direction. But instead of exposing the 
KFD IOCTLs through the DRM FD, I would let the KFD import a DRM FD.

And then use the DRM FD in the KFD for things like the buffer provider 
of a device, e.g. no separate IDR for BOs in the KFD but rather a 
reference to the DRM FD.

We can still manage the VM through KFD IOCTL, but the BOs and the VM are 
actually provided by the DRM FD.

Regards,
Christian.

> Regards,
>    Felix
>
>> Regards,
>> Christian.
>>
>> Am 07.02.2018 um 02:32 schrieb Felix Kuehling:
>>> From: Oak Zeng <Oak.Zeng at amd.com>
>>>
>>> Populate DRM render device minor in kfd topology
>>>
>>> Signed-off-by: Oak Zeng <Oak.Zeng at amd.com>
>>> Signed-off-by: Felix Kuehling <Felix.Kuehling at amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 4 ++++
>>>    drivers/gpu/drm/amd/amdkfd/kfd_topology.h | 1 +
>>>    2 files changed, 5 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> index 2506155..ac28abc 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
>>> @@ -441,6 +441,8 @@ static ssize_t node_show(struct kobject *kobj,
>>> struct attribute *attr,
>>>                dev->node_props.device_id);
>>>        sysfs_show_32bit_prop(buffer, "location_id",
>>>                dev->node_props.location_id);
>>> +    sysfs_show_32bit_prop(buffer, "drm_render_minor",
>>> +            dev->node_props.drm_render_minor);
>>>          if (dev->gpu) {
>>>            log_max_watch_addr =
>>> @@ -1214,6 +1216,8 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>>>            dev->gpu->kfd2kgd->get_max_engine_clock_in_mhz(dev->gpu->kgd);
>>>        dev->node_props.max_engine_clk_ccompute =
>>>            cpufreq_quick_get_max(0) / 1000;
>>> +    dev->node_props.drm_render_minor =
>>> +        gpu->shared_resources.drm_render_minor;
>>>          kfd_fill_mem_clk_max_info(dev);
>>>        kfd_fill_iolink_non_crat_info(dev);
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> index c0be2be..eb54cfc 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.h
>>> @@ -71,6 +71,7 @@ struct kfd_node_properties {
>>>        uint32_t location_id;
>>>        uint32_t max_engine_clk_fcompute;
>>>        uint32_t max_engine_clk_ccompute;
>>> +    int32_t  drm_render_minor;
>>>        uint16_t marketing_name[KFD_TOPOLOGY_PUBLIC_NAME_SIZE];
>>>    };
>>>    
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux