On 12/07/2023 18:01, Felix Kuehling wrote:

Ah, I meant userq_eviction_lock (not fence). I was curious: if we just want to keep the GPUVM from eviction, a lock can also do that for us.

On 2023-07-12 at 11:55, Shashank Sharma wrote:
On 11/07/2023 21:51, Felix Kuehling wrote:

I was planning to do something like this to keep userqueue objects from eviction:

On 2023-07-06 09:39, Christian König wrote:
On 06.07.23 at 15:37, Shashank Sharma wrote:
On 06/07/2023 15:22, Christian König wrote:

I was a bit curious about this: the scope of this MQD object is kernel internal and used for queue mapping only; userspace doesn't know much about it. Do you still think we should not create a kernel object for it?

On 06.07.23 at 14:35, Shashank Sharma wrote:

A Memory queue descriptor (MQD) of a userqueue defines it in
the hw's context. As MQD format can vary between different
graphics IPs, we need gfx GEN specific handlers to create MQDs.

This patch:
- Introduces MQD handler functions for the usermode queues.
- Adds new functions to create and destroy userqueue MQD for
  GFX-GEN-11 IP

V1: Worked on review comments from Alex:
    - Make MQD functions GEN and IP specific

V2: Worked on review comments from Alex:
    - Reuse the existing adev->mqd[ip] for MQD creation
    - Formatting and arrangement of code

V3:
    - Integration with doorbell manager

V4: Review comments addressed:
    - Do not create a new file for userq, reuse gfx_v11_0.c (Alex)
    - Align name of structure members (Luben)
    - Don't break up the Cc tag list and the Sob tag list in commit
      message (Luben)

V5:
    - No need to reserve the bo for MQD (Christian).
    - Some more changes to support IP specific MQD creation.

Cc: Alex Deucher <alexander.deucher@xxxxxxx>
Cc: Christian Koenig <christian.koenig@xxxxxxx>
Signed-off-by: Shashank Sharma <shashank.sharma@xxxxxxx>
Signed-off-by: Arvind Yadav <arvind.yadav@xxxxxxx>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 16 ++++
 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 73 +++++++++++++++++++
 .../gpu/drm/amd/include/amdgpu_userqueue.h | 7 ++
 3 files changed, 96 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
index e37b5da5a0d0..bb774144c372 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c
@@ -134,12 +134,28 @@ int amdgpu_userq_ioctl(struct drm_device *dev, void *data,
 	return r;
 }
 
+extern const struct amdgpu_userq_funcs userq_gfx_v11_funcs;
+
+static void
+amdgpu_userqueue_setup_gfx(struct amdgpu_userq_mgr *uq_mgr)
+{
+	int maj;
+	struct amdgpu_device *adev = uq_mgr->adev;
+	uint32_t version = adev->ip_versions[GC_HWIP][0];
+
+	/* We support usermode queue only for GFX V11 as of now */
+	maj = IP_VERSION_MAJ(version);
+	if (maj == 11)
+		uq_mgr->userq_funcs[AMDGPU_HW_IP_GFX] = &userq_gfx_v11_funcs;
+}
+
 int amdgpu_userq_mgr_init(struct amdgpu_userq_mgr *userq_mgr, struct amdgpu_device *adev)
 {
 	mutex_init(&userq_mgr->userq_mutex);
 	idr_init_base(&userq_mgr->userq_idr, 1);
 	userq_mgr->adev = adev;
 
+	amdgpu_userqueue_setup_gfx(userq_mgr);
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index c4940b6ea1c4..e76e1b86b434 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -30,6 +30,7 @@
 #include "amdgpu_psp.h"
 #include "amdgpu_smu.h"
 #include "amdgpu_atomfirmware.h"
+#include "amdgpu_userqueue.h"
 #include "imu_v11_0.h"
 #include "soc21.h"
 #include "nvd.h"
@@ -6486,3 +6487,75 @@ const struct amdgpu_ip_block_version gfx_v11_0_ip_block =
 	.rev = 0,
 	.funcs = &gfx_v11_0_ip_funcs,
 };
+
+static int gfx_v11_0_userq_mqd_create(struct amdgpu_userq_mgr *uq_mgr,
+				      struct drm_amdgpu_userq_in *args_in,
+				      struct amdgpu_usermode_queue *queue)
+{
+	struct amdgpu_device *adev = uq_mgr->adev;
+	struct amdgpu_mqd *mqd_gfx_generic = &adev->mqds[AMDGPU_HW_IP_GFX];
+	struct drm_amdgpu_userq_mqd_gfx_v11_0 mqd_user;
+	struct amdgpu_mqd_prop userq_props;
+	int r;
+
+	/* Incoming MQD parameters from userspace to be saved here */
+	memset(&mqd_user, 0, sizeof(mqd_user));
+
+	/* Structure to initialize MQD for userqueue using generic MQD init function */
+	memset(&userq_props, 0, sizeof(userq_props));
+
+	if (args_in->mqd_size != sizeof(struct drm_amdgpu_userq_mqd_gfx_v11_0)) {
+		DRM_ERROR("MQD size mismatch\n");
+		return -EINVAL;
+	}
+
+	if (copy_from_user(&mqd_user, u64_to_user_ptr(args_in->mqd), args_in->mqd_size)) {
+		DRM_ERROR("Failed to get user MQD\n");
+		return -EFAULT;
+	}
+
+	/* Create BO for actual Userqueue MQD now */
+	r = amdgpu_bo_create_kernel(adev, mqd_gfx_generic->mqd_size, PAGE_SIZE,
+				    AMDGPU_GEM_DOMAIN_GTT,
+				    &queue->mqd.obj,
+				    &queue->mqd.gpu_addr,
+				    &queue->mqd.cpu_ptr);
+	if (r) {
+		DRM_ERROR("Failed to allocate BO for userqueue (%d)", r);
+		return -ENOMEM;
+	}

Using amdgpu_bo_create_kernel() for the MQD is most likely not a good idea in the long term, but should work for now.

Well, we should use a kernel BO. But amdgpu_bo_create_kernel() not only creates a kernel BO but also pins it! And that is problematic because it allows userspace to mount a denial-of-service attack on the kernel module.

What we need is an eviction fence, e.g. what KFD is already using. Then the BO is created similarly to how VM page tables are created, maybe even using the same reservation object.

KFD doesn't currently use eviction fences on MQDs. Those are pinned. I guess you could treat the MQDs more like we treat page tables. They are allocated in kernel mode but protected with fences rather than pinning.

I'm not sure if MES needs to be able to access MQDs while queues are not mapped. If that's the case, pinning can't be avoided.

- Add a new mutex in queue struct (say userq_eviction_fence)
- lock this while mapping the queue, and unlock it while unmapping the queue
- add checks in amdgpu_vm_evictable() to add a mutex_trylock(userq_eviction_fence)

Does this sound like what we want to do here, or do I need something else?

A mutex is not a fence. The eviction fences we have in KFD are dma_fences. They interact with TTM's memory eviction logic, which allows us to stop user mode queues before TTM moves memory. A mutex cannot do that.
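
The difference between the two schemes discussed above can be sketched with a toy userspace model. This is only an illustration under loud assumptions: toy_queue, toy_fence, toy_bo, toy_stop_queue, toy_evict_with_trylock and toy_evict_with_fence are hypothetical names invented for this sketch, not kernel or amdgpu APIs, and the real dma_fence/TTM path is asynchronous and considerably more involved. The model only shows that a trylock can refuse an eviction, while a fence with a signaling hook lets the evictor stop the queue first and then move the memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a user mode queue mapped on the hardware. */
struct toy_queue {
	bool mapped;	/* true while the hardware may access the BO */
	bool map_lock;	/* models the proposed per-queue mutex being held */
};

/* Hypothetical stand-in for an eviction fence attached to a BO. */
struct toy_fence {
	bool signaled;
	void (*enable_signaling)(struct toy_fence *f);
	struct toy_queue *queue;
};

/* Hypothetical stand-in for a buffer object such as the MQD. */
struct toy_bo {
	int placement;			/* 0 = original placement, 1 = moved */
	struct toy_fence *fence;	/* NULL when the BO is unprotected */
};

/* Fence hook: unmap the queue, then report the BO as idle. */
static void toy_stop_queue(struct toy_fence *f)
{
	assert(f != NULL && f->queue != NULL);
	f->queue->mapped = false;
	f->signaled = true;
}

/* Lock scheme: while the queue is mapped, eviction can only be refused. */
static bool toy_evict_with_trylock(struct toy_bo *bo, struct toy_queue *q)
{
	assert(bo != NULL && q != NULL);
	if (q->map_lock)
		return false;	/* trylock would fail, memory stays in place */
	bo->placement = 1;
	return true;
}

/* Fence scheme: ask the owner to stop, check the fence, then move. */
static bool toy_evict_with_fence(struct toy_bo *bo)
{
	struct toy_fence *f;

	assert(bo != NULL);
	f = bo->fence;
	if (f && !f->signaled && f->enable_signaling)
		f->enable_signaling(f);
	if (f && !f->signaled)
		return false;	/* owner never released the BO */
	bo->placement = 1;
	return true;
}
```

In this model, a queue that stays mapped blocks toy_evict_with_trylock() indefinitely, which is effectively the same as pinning, whereas toy_evict_with_fence() makes progress because the fence owner unmaps the queue on request.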
But now I realize that we actually want to keep all the GPU BOs related to this queue (the MQD being one of them) safe from internal TTM movement, and that this needs to be done through the dma_fence/TTM interaction.
- Shashank
Regards,
Felix

Regards,
Felix

But for a test this here is probably ok.

Christian.

- Shashank

Probably best to add a comment here that this needs to be improved.

Apart from that looks good to me,
Christian.

+	memset(queue->mqd.cpu_ptr, 0, mqd_gfx_generic->mqd_size);
+
+	/* Initialize the MQD BO with user given values */
+	userq_props.wptr_gpu_addr = mqd_user.wptr_va;
+	userq_props.rptr_gpu_addr = mqd_user.rptr_va;
+	userq_props.queue_size = mqd_user.queue_size;
+	userq_props.hqd_base_gpu_addr = mqd_user.queue_va;
+	userq_props.mqd_gpu_addr = queue->mqd.gpu_addr;
+	userq_props.use_doorbell = true;
+
+	r = mqd_gfx_generic->init_mqd(adev, (void *)queue->mqd.cpu_ptr, &userq_props);
+	if (r) {
+		DRM_ERROR("Failed to initialize MQD for userqueue\n");
+		goto free_mqd;
+	}
+
+	return 0;
+
+free_mqd:
+	amdgpu_bo_free_kernel(&queue->mqd.obj, &queue->mqd.gpu_addr, &queue->mqd.cpu_ptr);
+	return r;
+}
+
+static void
+gfx_v11_0_userq_mqd_destroy(struct amdgpu_userq_mgr *uq_mgr, struct amdgpu_usermode_queue *queue)
+{
+	struct amdgpu_userq_obj *mqd = &queue->mqd;
+
+	amdgpu_bo_free_kernel(&mqd->obj, &mqd->gpu_addr, &mqd->cpu_ptr);
+}
+
+const struct amdgpu_userq_funcs userq_gfx_v11_funcs = {
+	.mqd_create = gfx_v11_0_userq_mqd_create,
+	.mqd_destroy = gfx_v11_0_userq_mqd_destroy,
+};
diff --git a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h b/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
index 55ed6512a565..240f92796f00 100644
--- a/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
+++ b/drivers/gpu/drm/amd/include/amdgpu_userqueue.h
@@ -29,6 +29,12 @@
 
 struct amdgpu_mqd_prop;
 
+struct amdgpu_userq_obj {
+	void *cpu_ptr;
+	uint64_t gpu_addr;
+	struct amdgpu_bo *obj;
+};
+
 struct amdgpu_usermode_queue {
 	int queue_type;
 	uint64_t doorbell_handle;
@@ -37,6 +43,7 @@ struct amdgpu_usermode_queue {
 	struct amdgpu_mqd_prop *userq_prop;
 	struct amdgpu_userq_mgr *userq_mgr;
 	struct amdgpu_vm *vm;
+	struct amdgpu_userq_obj mqd;
 };
 
 struct amdgpu_userq_funcs {