Re: [PATCH] [RFC]drm/ttm: fix scheduling balance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/25/2018 10:59 AM, Chunming Zhou wrote:
there is a scheduling balance issue about get node like:
a. process A allocates full memory and use it for submission.
b. process B tries to allocates memory, will wait for process A BO idle in eviction.
c. process A completes the job, process B eviction will put process A BO node,
but in the meantime, process C is comming to allocate BO, whill directly get node successfully, and do submission,
process B will again wait for process C BO idle.
d. repeat the above setps, process B could be delayed much more.

add a mutex to gerantee the allocation sequence for same domain. But there is a possibility that
visible vram could be evicted to invisilbe, the tricky is they are same domain manager, so which needs a special handling.

Change-Id: I260e8eb704f7b4788b071d3f641f21b242912256
Signed-off-by: Chunming Zhou <david1.zhou@xxxxxxx>

I think this is a good approach, however there are two things that IMO needs fixing.

1) You must lock mutex interruptible if requested. It's important for fast delivery of Xserver mouse signals, and also to avoid unkillable processes on GPU hang. 2) How do we resolve contradictary eviction order between buffers? You'd probably get lockdep warnings if that should happen. Now I'm not sure there are drivers that do this, and that would probably be a bug if they did. In principle you'd have to either establish a manager priority and document that drivers have to do that,
or use ww_mutexes.

/Thomas

---
  drivers/gpu/drm/ttm/ttm_bo.c    | 16 +++++++++++++++-
  include/drm/ttm/ttm_bo_api.h    |  2 ++
  include/drm/ttm/ttm_bo_driver.h |  1 +
  3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d33a6bb742a1..f465f1d38711 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -969,7 +969,11 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
  		if (mem_type == TTM_PL_SYSTEM)
  			break;
+ if (ctx->man != man)
+			mutex_lock(&man->node_mutex);
  		ret = (*man->func->get_node)(man, bo, place, mem);
+		if (ctx->man != man)
+			mutex_unlock(&man->node_mutex);
  		if (unlikely(ret))
  			return ret;
@@ -991,6 +995,7 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo, for (i = 0; i < placement->num_busy_placement; ++i) {
  		const struct ttm_place *place = &placement->busy_placement[i];
+		struct ttm_operation_ctx busy_ctx;
ret = ttm_mem_type_from_place(place, &mem_type);
  		if (ret)
@@ -1018,7 +1023,15 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo,
  			return 0;
  		}
- ret = ttm_bo_mem_force_space(bo, mem_type, place, mem, ctx);
+		memcpy(&busy_ctx, ctx, sizeof(busy_ctx));
+		if (ctx->man != man) {
+			mutex_lock(&man->node_mutex);
+			busy_ctx.man = man;
+		}
+		ret = ttm_bo_mem_force_space(bo, mem_type, place, mem,
+					     &busy_ctx);
+		if (ctx->man != man)
+			mutex_unlock(&man->node_mutex);
  		if (ret == 0 && mem->mm_node) {
  			mem->placement = cur_flags;
  			return 0;
@@ -1449,6 +1462,7 @@ int ttm_bo_init_mm(struct ttm_bo_device *bdev, unsigned type,
  	man->io_reserve_fastpath = true;
  	man->use_io_reserve_lru = false;
  	mutex_init(&man->io_reserve_mutex);
+	mutex_init(&man->node_mutex);
  	spin_lock_init(&man->move_lock);
  	INIT_LIST_HEAD(&man->io_reserve_lru);
diff --git a/include/drm/ttm/ttm_bo_api.h b/include/drm/ttm/ttm_bo_api.h
index 2cd025c2abe7..b08629050215 100644
--- a/include/drm/ttm/ttm_bo_api.h
+++ b/include/drm/ttm/ttm_bo_api.h
@@ -265,6 +265,7 @@ struct ttm_bo_kmap_obj {
   * @no_wait_gpu: Return immediately if the GPU is busy.
   * @allow_reserved_eviction: Allow eviction of reserved BOs.
   * @resv: Reservation object to allow reserved evictions with.
+ * @man: record ctx is under man->node_mutex.
   *
   * Context for TTM operations like changing buffer placement or general memory
   * allocation.
@@ -275,6 +276,7 @@ struct ttm_operation_ctx {
  	bool allow_reserved_eviction;
  	struct reservation_object *resv;
  	uint64_t bytes_moved;
+	struct ttm_mem_type_manager *man;
  };
/**
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 9b417eb2df20..d85d6ee4f54d 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -288,6 +288,7 @@ struct ttm_mem_type_manager {
  	uint32_t default_caching;
  	const struct ttm_mem_type_manager_func *func;
  	void *priv;
+	struct mutex node_mutex;
  	struct mutex io_reserve_mutex;
  	bool use_io_reserve_lru;
  	bool io_reserve_fastpath;


_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux