Patch #1, #2, #4 and #6 are Reviewed-by: Christian König
<christian.koenig at amd.com>.

Patch #3:

> +	/* Select the first entry VMID */
> +	idle = list_first_entry(&id_mgr->ids_lru, struct amdgpu_vm_id, list);
> +	list_del_init(&idle->list);
> +	vm->reserved_vmid[vmhub] = idle;
> +	mutex_unlock(&id_mgr->lock);

I think we should wait for the VMID to be completely idle before we can
use it, but that could possibly also go into the handling in patch #5.

Patch #5:

> +	if ((amdgpu_vm_had_gpu_reset(adev, id)) ||
> +	    (atomic64_read(&id->owner) != vm->client_id) ||
> +	    (job->vm_pd_addr != id->pd_gpu_addr) ||
> +	    (updates && (!flushed || updates->context != flushed->context ||
> +	     fence_is_later(updates, flushed)))) {

You also need this check here as well:

	if (!id->last_flush ||
	    (id->last_flush->context != fence_context &&
	     !fence_is_signaled(id->last_flush)))

> +	tmp = amdgpu_sync_get_fence(&id->active);
> +	if (tmp) {
> +		r = amdgpu_sync_fence(adev, sync, tmp);
> +		fence_put(tmp);
> +		return r;
> +	}

That won't work correctly.

The first problem is that amdgpu_sync_get_fence() removes the fence from
the active fences, so another command submission from a different context
won't wait for all the necessary fences. I think using
amdgpu_sync_peek_fence() instead should work here.

The second problem is that a context could be starved when it needs a VM
flush while another context can still submit jobs without a flush. I
think you could work around that by setting id->pd_gpu_addr to an invalid
value when we hit this case; this way all other contexts would need to do
a VM flush as well.

> +	id->pd_gpu_addr = job->vm_pd_addr;
> +	id->current_gpu_reset_count = atomic_read(&adev->gpu_reset_counter);
> +	atomic64_set(&id->owner, vm->client_id);
> +	job->vm_needs_flush = needs_flush;

If we need a flush, id->last_flush needs to be set to NULL here as well,
e.g. do:

	fence_put(id->last_flush);
	id->last_flush = NULL;

Regards,
Christian.
Am 27.04.2017 um 07:00 schrieb Chunming Zhou:
> The current kernel implementation, which grabs an idle VMID from the
> pool when emitting the job, may:
>
> - use a different VMID for back-to-back submissions from one process;
> - use a different VMID for submissions to different queues from a
>   single process.
>
> This works well in most cases but cannot work for SQ thread trace
> capture.
>
> The VMID of the submission that sets {SQTT}_BASE, which refers to the
> address of the trace buffer, is stored in the shader engine.
>
> If the profiling application has to use different VMIDs to submit IBs
> over its life cycle:
>
> - Some traces are not captured, since the jobs are actually submitted
>   with a different VMID.
> - Some parts of the captured trace may come from a different
>   application, since it accidentally uses the owner's VMID to submit
>   jobs.
>
> V2:
> 1. address Christian's comments:
>    a. drop context flags for tag process; instead, add vm ioctl.
>    b. change order of patches.
>    c. sync waiting only when a vm flush is needed.
> 2. address Alex's comments:
>    bump module version
>
> V3:
> address Jerry and Christian's comments,
> and only reserve the gfxhub vmid
>
> v4:
> address Jerry and Christian's comments.
> fix some race conditions.
>
> Chunming Zhou (6):
>   drm/amdgpu: add vm ioctl
>   drm/amdgpu: add reserved vmid field in vm struct v2
>   drm/amdgpu: reserve/unreserve vmid by vm ioctl v4
>   drm/amdgpu: add limitation for dedicated vm number v4
>   drm/amdgpu: implement grab reserved vmid V3
>   drm/amdgpu: bump module verion for reserved vmid
>
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |   3 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c |   1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c  | 152 ++++++++++++++++++++++++++++++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |   6 ++
>  include/uapi/drm/amdgpu_drm.h           |  22 +++++
>  5 files changed, 178 insertions(+), 6 deletions(-)