Am 12.11.24 um 19:16 schrieb Li, Yunxiang (Teddy):
[Public]
From: Christian König <ckoenig.leichtzumerken@xxxxxxxxx>
Sent: Tuesday, November 12, 2024 5:54
Am 10.11.24 um 16:41 schrieb Yunxiang Li:
@@ -310,6 +311,92 @@ static void amdgpu_vm_bo_reset_state_machine(struct
amdgpu_vm *vm)
spin_unlock(&vm->status_lock);
}
+/**
+ * amdgpu_vm_update_shared - helper to update shared memory stat
+ * @base: base structure for tracking BO usage in a VM
+ * @sign: if we should add (+1) or subtract (-1) from the shared stat
+ *
+ * Takes the vm status_lock and updates the shared memory stat. If
+the basic
+ * stat changed (e.g. buffer was moved) amdgpu_vm_update_stats need
+to be called
+ * as well.
+ */
+static void amdgpu_vm_update_shared(struct amdgpu_vm_bo_base *base,
+int sign) {
+ struct amdgpu_vm *vm = base->vm;
+ struct amdgpu_bo *bo = base->bo;
+ struct ttm_resource *res;
+ int64_t size;
+ uint32_t type;
+
+ if (!vm || !bo)
+ return;
+
+ size = sign * amdgpu_bo_size(bo);
+ res = bo->tbo.resource;
+ type = res ? res->mem_type : amdgpu_bo_get_preferred_placement(bo);
Again, it's a clear NAK from my side to do stuff like that.
When there isn't any backing store the BO should *not* be accounted to anything.
I don't have a preference either way, but I think it should be a separate discussion to properly define what drm-total- means.
+ type = res ? res->mem_type : amdgpu_bo_get_preferred_placement(bo);
+ shared = drm_gem_object_is_shared_for_memory_stats(&bo->tbo.base);
+
+ if (type >= __AMDGPU_PL_LAST)
+ return;
+
+ spin_lock(&vm->status_lock);
+
+ if (shared)
+ vm->stats[type].drm.shared += size;
+ else
+ vm->stats[type].drm.private += size;
+ if (res)
+ vm->stats[type].drm.resident += size;
+ if (bo->flags & AMDGPU_GEM_CREATE_DISCARDABLE)
+ vm->stats[type].drm.purgeable += size;
+
+ if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) {
+ vm->stats[TTM_PL_VRAM].requested += size;
+ if (type != TTM_PL_VRAM)
+ vm->stats[TTM_PL_VRAM].evicted += size;
Again that is incorrect. BOs can be created with VRAM|GTT as their placement.
If such a BO is placed into GTT that doesn't mean it is evicted.
In that case, do we count BO with VRAM|GTT in both VRAM and GTT's .requested field? and if they are not in either, they go in both .evicted field?
Oh, good question depends on the definition of the requested field.
Accounting it to VRAM.evicted while GTT placement is desirable as well
is certainly not correct.
From my understanding they should go into the VRAM request, but not
account to evicted. But Tvrtko might see that differently.
@@ -2612,7 +2707,6 @@ void amdgpu_vm_fini(struct amdgpu_device *adev,
struct amdgpu_vm *vm)
root = amdgpu_bo_ref(vm->root.bo);
amdgpu_bo_reserve(root, true);
- amdgpu_vm_put_task_info(vm->task_info);
amdgpu_vm_set_pasid(adev, vm, 0);
dma_fence_wait(vm->last_unlocked, false);
dma_fence_put(vm->last_unlocked);
@@ -2660,6 +2754,15 @@ void amdgpu_vm_fini(struct amdgpu_device *adev,
struct amdgpu_vm *vm)
}
}
+ if (!amdgpu_vm_stats_is_zero(vm)) {
+ struct amdgpu_task_info *ti = vm->task_info;
+
+ dev_warn(adev->dev,
+ "VM memory stats for proc %s(%d) task %s(%d) is non-zero
when fini\n",
+ ti->process_name, ti->pid, ti->task_name, ti->tgid);
+ }
+
+ amdgpu_vm_put_task_info(vm->task_info);
Please don't move the call to amdgpu_vm_put_task_info().
Is keeping the task_info alive a hazard here? I could copy out the info, it just seemed a bit wasteful.
Ah, now I see why you have moved that.
IIRC we need to free up the task info before releasing the PASID, but
that info might be outdated. Need to check the code.
Does it work if you move the message further up or does the root PD then
break your neck because it isn't released yet?
Thanks,
Christian.
Regards,
Teddy