For xnack on, if range ACCESS or ACCESS_IN_PLACE (AIP) by single GPU, or range is ACCESS_IN_PLACE by mGPUs and all mGPUs connection on xgmi same hive, the best prefetch location is prefetch_loc GPU. Otherwise, the best prefetch location is always CPU because GPU can not map vram of other GPUs through small bar PCIe. Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx> --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 44 +++++++++++++++++----------- 1 file changed, 27 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 69237d2ab2ad..6160c301f78b 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2754,22 +2754,29 @@ svm_range_add(struct kfd_process *p, uint64_t start, uint64_t size, return 0; } -/* svm_range_best_prefetch_location - decide the best prefetch location +/** + * svm_range_best_prefetch_location - decide the best prefetch location * @prange: svm range structure * * For xnack off: - * If range map to single GPU, the best acutal location is prefetch loc, which + * If range map to single GPU, the best prefetch location is prefetch_loc, which * can be CPU or GPU. * - * If range map to multiple GPUs, only if mGPU connection on xgmi same hive, - * the best actual location could be prefetch_loc GPU. If mGPU connection on - * PCIe, the best actual location is always CPU, because GPU cannot access vram - * of other GPUs, assuming PCIe small bar (large bar support is not upstream). + * If range is ACCESS or ACCESS_IN_PLACE by mGPUs, only if mGPU connection on + * xgmi same hive, the best prefetch location is prefetch_loc GPU. If mGPUs + * connection on PCIe, the best prefetch location is always CPU, because GPU + * cannot map vram of other GPUs, assuming PCIe small bar (large bar support is + * not upstream). * * For xnack on: - * The best actual location is prefetch location. If mGPU connection on xgmi - * same hive, range map to multiple GPUs. Otherwise, the range only map to - * actual location GPU. Other GPU access vm fault will trigger migration. + * If range map to single GPU or is not ACCESS_IN_PLACE by mGPUs, the best + * prefetch location is prefetch_loc, other GPU access vm fault will trigger + * migration. + * + * If range is ACCESS_IN_PLACE by mGPUs, only if mGPU connection on xgmi same + * hive, the best prefetch location is prefetch_loc GPU. If mGPUs connection on + * PCIe, the best prefetch location is always CPU, because GPU cannot map vram + * of other GPUs. * * Context: Process context * @@ -2787,14 +2794,13 @@ svm_range_best_prefetch_location(struct svm_range *prange) struct kfd_process *p; uint32_t gpuidx; - p = container_of(prange->svms, struct kfd_process, svms); - - /* xnack on */ - if (p->xnack_enabled) + if (!best_loc || best_loc == KFD_IOCTL_SVM_LOCATION_UNDEFINED) goto out; - /* xnack off */ - if (!best_loc || best_loc == KFD_IOCTL_SVM_LOCATION_UNDEFINED) + p = container_of(prange->svms, struct kfd_process, svms); + + if (p->xnack_enabled && + bitmap_weight(prange->bitmap_aip, MAX_GPU_INSTANCE) <= 1) goto out; bo_adev = svm_range_get_adev_by_id(prange, best_loc); @@ -2803,8 +2809,12 @@ svm_range_best_prefetch_location(struct svm_range *prange) best_loc = 0; goto out; } - bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip, - MAX_GPU_INSTANCE); + + if (p->xnack_enabled) + bitmap_copy(bitmap, prange->bitmap_aip, MAX_GPU_INSTANCE); + else + bitmap_or(bitmap, prange->bitmap_access, prange->bitmap_aip, + MAX_GPU_INSTANCE); for_each_set_bit(gpuidx, bitmap, MAX_GPU_INSTANCE) { pdd = kfd_process_device_from_gpuidx(p, gpuidx); -- 2.17.1