Re: [PATCH 1/3] drm/radeon: stop poisoning the GART TLB

On 24.06.2014 08:49, Michel Dänzer wrote:
> On 23.06.2014 18:56, Christian König wrote:
>> On 23.06.2014 10:15, Michel Dänzer wrote:
>>> On 19.06.2014 18:45, Christian König wrote:
>>>>
>>>> I think even if we revert to the old code we have a couple of unsolved
>>>> problems with the VM support, or in the driver in general, where we
>>>> should try to understand the underlying reason instead of applying more
>>>> workarounds.
>>> I'm not suggesting applying more workarounds but going back to a known,
>>> more stable state. It seems like we've maneuvered ourselves into a rather
>>> uncomfortable position from there, with no clear way to a better place.
>>> But if we basically started from the 3.14 state again, we would have a few
>>> known hurdles, like mine and Marek's Bonaire etc., which we know any
>>> further improvements will have to pass before they can be considered for
>>> general consumption.
>> Yeah, agreed, especially on the uncomfortable position.
>>
>> Please try with the two attached patches applied on top of 3.15 and
>> retest. They should revert back to the old implementation.
> Unfortunately, X fails to start with these, see the attached excerpt
> from dmesg.

My fault, I incorrectly resolved a merge conflict and then failed to test the right kernel.

BTW: Wasn't there an option to tell GRUB to use the most recently installed kernel instead of the one with the highest version number? I can't seem to find it any more.

Please try the attached patches instead,
Christian.
From 17300436e5598357cb9396d3d52c8c40064adc16 Mon Sep 17 00:00:00 2001
From: Christian König <christian.koenig@xxxxxxx>
Date: Mon, 23 Jun 2014 11:07:29 +0200
Subject: [PATCH 1/2] drm/radeon: Revert drop non blocking allocations from sub
 allocator

The next revert needs this functionality.

This reverts commit 4d1526466296360f56f93c195848c1202b0cc10b.
---
 drivers/gpu/drm/radeon/radeon_object.h    | 2 +-
 drivers/gpu/drm/radeon/radeon_ring.c      | 2 +-
 drivers/gpu/drm/radeon/radeon_sa.c        | 7 +++++--
 drivers/gpu/drm/radeon/radeon_semaphore.c | 2 +-
 4 files changed, 8 insertions(+), 5 deletions(-)
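
With this interface reintroduced, a caller that passes block=false gets -ENOMEM
back when there is nothing left to wait for, instead of sleeping on the wait
queue, and can make room itself before retrying; the existing IB and semaphore
callers simply keep passing true. A rough sketch of the non-blocking pattern
(make_room() is only a placeholder here; patch 2/2 does the real work by
evicting another VM's page tables):

	/*
	 * Sketch only: how a non-blocking caller is expected to use the
	 * restored radeon_sa_bo_new() signature.
	 */
	static int example_nonblocking_alloc(struct radeon_device *rdev,
					     struct radeon_sa_manager *sa_manager,
					     struct radeon_sa_bo **sa_bo,
					     unsigned size, unsigned align)
	{
		int r;

	retry:
		/* block == false: -ENOMEM when there is nothing left to wait for */
		r = radeon_sa_bo_new(rdev, sa_manager, sa_bo, size, align, false);
		if (r == -ENOMEM) {
			if (make_room(rdev))	/* placeholder for eviction */
				return -ENOMEM;
			goto retry;
		}
		return r;
	}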

diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index 9e7b25a..7dff64d 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -180,7 +180,7 @@ extern int radeon_sa_bo_manager_suspend(struct radeon_device *rdev,
 extern int radeon_sa_bo_new(struct radeon_device *rdev,
 			    struct radeon_sa_manager *sa_manager,
 			    struct radeon_sa_bo **sa_bo,
-			    unsigned size, unsigned align);
+			    unsigned size, unsigned align, bool block);
 extern void radeon_sa_bo_free(struct radeon_device *rdev,
 			      struct radeon_sa_bo **sa_bo,
 			      struct radeon_fence *fence);
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index f8050f5..62201db 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -63,7 +63,7 @@ int radeon_ib_get(struct radeon_device *rdev, int ring,
 {
 	int r;
 
-	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &ib->sa_bo, size, 256);
+	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &ib->sa_bo, size, 256, true);
 	if (r) {
 		dev_err(rdev->dev, "failed to get a new IB (%d)\n", r);
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index adcf3e2..c062580 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -312,7 +312,7 @@ static bool radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager,
 int radeon_sa_bo_new(struct radeon_device *rdev,
 		     struct radeon_sa_manager *sa_manager,
 		     struct radeon_sa_bo **sa_bo,
-		     unsigned size, unsigned align)
+		     unsigned size, unsigned align, bool block)
 {
 	struct radeon_fence *fences[RADEON_NUM_RINGS];
 	unsigned tries[RADEON_NUM_RINGS];
@@ -353,11 +353,14 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 		r = radeon_fence_wait_any(rdev, fences, false);
 		spin_lock(&sa_manager->wq.lock);
 		/* if we have nothing to wait for block */
-		if (r == -ENOENT) {
+		if (r == -ENOENT && block) {
 			r = wait_event_interruptible_locked(
 				sa_manager->wq, 
 				radeon_sa_event(sa_manager, size, align)
 			);
+
+		} else if (r == -ENOENT) {
+			r = -ENOMEM;
 		}
 
 	} while (!r);
diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c
index dbd6bcd..6140af6 100644
--- a/drivers/gpu/drm/radeon/radeon_semaphore.c
+++ b/drivers/gpu/drm/radeon/radeon_semaphore.c
@@ -42,7 +42,7 @@ int radeon_semaphore_create(struct radeon_device *rdev,
 		return -ENOMEM;
 	}
 	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &(*semaphore)->sa_bo,
-			     8 * RADEON_NUM_SYNCS, 8);
+			     8 * RADEON_NUM_SYNCS, 8, true);
 	if (r) {
 		kfree(*semaphore);
 		*semaphore = NULL;
-- 
1.9.1

From 28cd0733ff4b91b917962e964255f0a12278b29a Mon Sep 17 00:00:00 2001
From: Christian König <christian.koenig@xxxxxxx>
Date: Mon, 23 Jun 2014 11:08:24 +0200
Subject: [PATCH 2/2] drm/radeon: Revert use normal BOs for the page tables v2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This reverts the commit "use normal BOs for the page tables v4" and the following dependent bug fixes:

drm/radeon: sync page table updates
drm/radeon: fix vm buffer size estimation
drm/radeon: only allocate necessary size for vm bo list
drm/radeon: fix page directory update size estimation
drm/radeon: remove global vm lock

v2: fix incorrect merge conflict solving

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
---
 drivers/gpu/drm/radeon/radeon.h        |  24 +-
 drivers/gpu/drm/radeon/radeon_cs.c     |  48 ++-
 drivers/gpu/drm/radeon/radeon_device.c |   4 +-
 drivers/gpu/drm/radeon/radeon_kms.c    |   7 +-
 drivers/gpu/drm/radeon/radeon_ring.c   |   7 -
 drivers/gpu/drm/radeon/radeon_vm.c     | 513 ++++++++++++++++++---------------
 6 files changed, 309 insertions(+), 294 deletions(-)
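
In short, the command submission path restored by this revert takes the global
VM manager lock again and allocates the page directory on demand at CS time.
Condensed from the radeon_cs.c hunk below (error handling, ring syncing and the
UVD/const IB handling trimmed), it looks roughly like this:

	mutex_lock(&rdev->vm_manager.lock);	/* global lock is back */
	mutex_lock(&vm->mutex);

	r = radeon_vm_alloc_pt(rdev, vm);	/* page directory from the SA manager */
	if (!r)
		r = radeon_bo_vm_update_pte(parser, vm);

	/* ... sync rings, sync to the VM fence, grab a VM id ... */

	radeon_vm_add_to_lru(rdev, vm);		/* becomes a candidate for eviction */
	mutex_unlock(&vm->mutex);
	mutex_unlock(&rdev->vm_manager.lock);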

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 8149e7c..b390d79 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -854,22 +854,17 @@ struct radeon_mec {
 #define R600_PTE_READABLE	(1 << 5)
 #define R600_PTE_WRITEABLE	(1 << 6)
 
-struct radeon_vm_pt {
-	struct radeon_bo		*bo;
-	uint64_t			addr;
-};
-
 struct radeon_vm {
+	struct list_head		list;
 	struct list_head		va;
 	unsigned			id;
 
 	/* contains the page directory */
-	struct radeon_bo		*page_directory;
+	struct radeon_sa_bo		*page_directory;
 	uint64_t			pd_gpu_addr;
-	unsigned			max_pde_used;
 
 	/* array of page tables, one for each page directory entry */
-	struct radeon_vm_pt		*page_tables;
+	struct radeon_sa_bo		**page_tables;
 
 	struct mutex			mutex;
 	/* last fence for cs using this vm */
@@ -881,7 +876,10 @@ struct radeon_vm {
 };
 
 struct radeon_vm_manager {
+	struct mutex			lock;
+	struct list_head		lru_vm;
 	struct radeon_fence		*active[RADEON_NUM_VM];
+	struct radeon_sa_manager	sa_manager;
 	uint32_t			max_pfn;
 	/* number of VMIDs */
 	unsigned			nvm;
@@ -1013,7 +1011,6 @@ struct radeon_cs_parser {
 	unsigned		nrelocs;
 	struct radeon_cs_reloc	*relocs;
 	struct radeon_cs_reloc	**relocs_ptr;
-	struct radeon_cs_reloc	*vm_bos;
 	struct list_head	validated;
 	unsigned		dma_reloc_idx;
 	/* indices of various chunks */
@@ -2807,11 +2804,10 @@ extern void radeon_program_register_sequence(struct radeon_device *rdev,
  */
 int radeon_vm_manager_init(struct radeon_device *rdev);
 void radeon_vm_manager_fini(struct radeon_device *rdev);
-int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm);
+void radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm);
 void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm);
-struct radeon_cs_reloc *radeon_vm_get_bos(struct radeon_device *rdev,
-					  struct radeon_vm *vm,
-                                          struct list_head *head);
+int radeon_vm_alloc_pt(struct radeon_device *rdev, struct radeon_vm *vm);
+void radeon_vm_add_to_lru(struct radeon_device *rdev, struct radeon_vm *vm);
 struct radeon_fence *radeon_vm_grab_id(struct radeon_device *rdev,
 				       struct radeon_vm *vm, int ring);
 void radeon_vm_flush(struct radeon_device *rdev,
@@ -2821,8 +2817,6 @@ void radeon_vm_fence(struct radeon_device *rdev,
 		     struct radeon_vm *vm,
 		     struct radeon_fence *fence);
 uint64_t radeon_vm_map_gart(struct radeon_device *rdev, uint64_t addr);
-int radeon_vm_update_page_directory(struct radeon_device *rdev,
-				    struct radeon_vm *vm);
 int radeon_vm_bo_update(struct radeon_device *rdev,
 			struct radeon_vm *vm,
 			struct radeon_bo *bo,
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 41ecf8a..06a00a1 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -173,10 +173,6 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 
 	radeon_cs_buckets_get_list(&buckets, &p->validated);
 
-	if (p->cs_flags & RADEON_CS_USE_VM)
-		p->vm_bos = radeon_vm_get_bos(p->rdev, p->ib.vm,
-					      &p->validated);
-
 	return radeon_bo_list_validate(p->rdev, &p->ticket, &p->validated, p->ring);
 }
 
@@ -417,7 +413,6 @@ static void radeon_cs_parser_fini(struct radeon_cs_parser *parser, int error, bo
 	kfree(parser->track);
 	kfree(parser->relocs);
 	kfree(parser->relocs_ptr);
-	kfree(parser->vm_bos);
 	for (i = 0; i < parser->nchunks; i++)
 		drm_free_large(parser->chunks[i].kdata);
 	kfree(parser->chunks);
@@ -457,32 +452,24 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
 	return r;
 }
 
-static int radeon_bo_vm_update_pte(struct radeon_cs_parser *p,
+static int radeon_bo_vm_update_pte(struct radeon_cs_parser *parser,
 				   struct radeon_vm *vm)
 {
-	struct radeon_device *rdev = p->rdev;
-	int i, r;
-
-	r = radeon_vm_update_page_directory(rdev, vm);
-	if (r)
-		return r;
+	struct radeon_device *rdev = parser->rdev;
+	struct radeon_cs_reloc *lobj;
+	struct radeon_bo *bo;
+	int r;
 
-	r = radeon_vm_bo_update(rdev, vm, rdev->ring_tmp_bo.bo,
-				&rdev->ring_tmp_bo.bo->tbo.mem);
-	if (r)
+	r = radeon_vm_bo_update(rdev, vm, rdev->ring_tmp_bo.bo, &rdev->ring_tmp_bo.bo->tbo.mem);
+	if (r) {
 		return r;
-
-	for (i = 0; i < p->nrelocs; i++) {
-		struct radeon_bo *bo;
-
-		/* ignore duplicates */
-		if (p->relocs_ptr[i] != &p->relocs[i])
-			continue;
-
-		bo = p->relocs[i].robj;
-		r = radeon_vm_bo_update(rdev, vm, bo, &bo->tbo.mem);
-		if (r)
+	}
+	list_for_each_entry(lobj, &parser->validated, tv.head) {
+		bo = lobj->robj;
+		r = radeon_vm_bo_update(parser->rdev, vm, bo, &bo->tbo.mem);
+		if (r) {
 			return r;
+		}
 	}
 	return 0;
 }
@@ -514,13 +501,20 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 	if (parser->ring == R600_RING_TYPE_UVD_INDEX)
 		radeon_uvd_note_usage(rdev);
 
+	mutex_lock(&rdev->vm_manager.lock);
 	mutex_lock(&vm->mutex);
+	r = radeon_vm_alloc_pt(rdev, vm);
+	if (r) {
+		goto out;
+	}
 	r = radeon_bo_vm_update_pte(parser, vm);
 	if (r) {
 		goto out;
 	}
 	radeon_cs_sync_rings(parser);
 	radeon_semaphore_sync_to(parser->ib.semaphore, vm->fence);
+	radeon_semaphore_sync_to(parser->ib.semaphore,
+				 radeon_vm_grab_id(rdev, vm, parser->ring));
 
 	if ((rdev->family >= CHIP_TAHITI) &&
 	    (parser->chunk_const_ib_idx != -1)) {
@@ -530,7 +524,9 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 	}
 
 out:
+	radeon_vm_add_to_lru(rdev, vm);
 	mutex_unlock(&vm->mutex);
+	mutex_unlock(&rdev->vm_manager.lock);
 	return r;
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 2cd144c..9ebd035 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1195,12 +1195,14 @@ int radeon_device_init(struct radeon_device *rdev,
 	r = radeon_gem_init(rdev);
 	if (r)
 		return r;
-
+	/* initialize vm here */
+	mutex_init(&rdev->vm_manager.lock);
 	/* Adjust VM size here.
 	 * Currently set to 4GB ((1 << 20) 4k pages).
 	 * Max GPUVM size for cayman and SI is 40 bits.
 	 */
 	rdev->vm_manager.max_pfn = 1 << 20;
+	INIT_LIST_HEAD(&rdev->vm_manager.lru_vm);
 
 	/* Set asic functions */
 	r = radeon_asic_init(rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index eaaedba..cc47fa1 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -571,12 +571,7 @@ int radeon_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 			return -ENOMEM;
 		}
 
-		r = radeon_vm_init(rdev, &fpriv->vm);
-		if (r) {
-			kfree(fpriv);
-			return r;
-		}
-
+		radeon_vm_init(rdev, &fpriv->vm);
 		if (rdev->accel_working) {
 			r = radeon_bo_reserve(rdev->ring_tmp_bo.bo, false);
 			if (r) {
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 62201db..4ddc6d77 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -145,13 +145,6 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib,
 		return r;
 	}
 
-	/* grab a vm id if necessary */
-	if (ib->vm) {
-		struct radeon_fence *vm_id_fence;
-		vm_id_fence = radeon_vm_grab_id(rdev, ib->vm, ib->ring);
-        	radeon_semaphore_sync_to(ib->semaphore, vm_id_fence);
-	}
-
 	/* sync with other rings */
 	r = radeon_semaphore_sync_rings(rdev, ib->semaphore, ib->ring);
 	if (r) {
diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index c11b71d..5160176 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -84,19 +84,85 @@ static unsigned radeon_vm_directory_size(struct radeon_device *rdev)
  */
 int radeon_vm_manager_init(struct radeon_device *rdev)
 {
+	struct radeon_vm *vm;
+	struct radeon_bo_va *bo_va;
 	int r;
+	unsigned size;
 
 	if (!rdev->vm_manager.enabled) {
+		/* allocate enough for 2 full VM pts */
+		size = radeon_vm_directory_size(rdev);
+		size += rdev->vm_manager.max_pfn * 8;
+		size *= 2;
+		r = radeon_sa_bo_manager_init(rdev, &rdev->vm_manager.sa_manager,
+					      RADEON_GPU_PAGE_ALIGN(size),
+					      RADEON_VM_PTB_ALIGN_SIZE,
+					      RADEON_GEM_DOMAIN_VRAM);
+		if (r) {
+			dev_err(rdev->dev, "failed to allocate vm bo (%dKB)\n",
+				(rdev->vm_manager.max_pfn * 8) >> 10);
+			return r;
+		}
+
 		r = radeon_asic_vm_init(rdev);
 		if (r)
 			return r;
 
 		rdev->vm_manager.enabled = true;
+
+		r = radeon_sa_bo_manager_start(rdev, &rdev->vm_manager.sa_manager);
+		if (r)
+			return r;
+	}
+
+	/* restore page table */
+	list_for_each_entry(vm, &rdev->vm_manager.lru_vm, list) {
+		if (vm->page_directory == NULL)
+			continue;
+
+		list_for_each_entry(bo_va, &vm->va, vm_list) {
+			bo_va->valid = false;
+		}
 	}
 	return 0;
 }
 
 /**
+ * radeon_vm_free_pt - free the page table for a specific vm
+ *
+ * @rdev: radeon_device pointer
+ * @vm: vm to unbind
+ *
+ * Free the page table of a specific vm (cayman+).
+ *
+ * Global and local mutex must be lock!
+ */
+static void radeon_vm_free_pt(struct radeon_device *rdev,
+				    struct radeon_vm *vm)
+{
+	struct radeon_bo_va *bo_va;
+	int i;
+
+	if (!vm->page_directory)
+		return;
+
+	list_del_init(&vm->list);
+	radeon_sa_bo_free(rdev, &vm->page_directory, vm->fence);
+
+	list_for_each_entry(bo_va, &vm->va, vm_list) {
+		bo_va->valid = false;
+	}
+
+	if (vm->page_tables == NULL)
+		return;
+
+	for (i = 0; i < radeon_vm_num_pdes(rdev); i++)
+		radeon_sa_bo_free(rdev, &vm->page_tables[i], vm->fence);
+
+	kfree(vm->page_tables);
+}
+
+/**
  * radeon_vm_manager_fini - tear down the vm manager
  *
  * @rdev: radeon_device pointer
@@ -105,63 +171,155 @@ int radeon_vm_manager_init(struct radeon_device *rdev)
  */
 void radeon_vm_manager_fini(struct radeon_device *rdev)
 {
+	struct radeon_vm *vm, *tmp;
 	int i;
 
 	if (!rdev->vm_manager.enabled)
 		return;
 
-	for (i = 0; i < RADEON_NUM_VM; ++i)
+	mutex_lock(&rdev->vm_manager.lock);
+	/* free all allocated page tables */
+	list_for_each_entry_safe(vm, tmp, &rdev->vm_manager.lru_vm, list) {
+		mutex_lock(&vm->mutex);
+		radeon_vm_free_pt(rdev, vm);
+		mutex_unlock(&vm->mutex);
+	}
+	for (i = 0; i < RADEON_NUM_VM; ++i) {
 		radeon_fence_unref(&rdev->vm_manager.active[i]);
+	}
 	radeon_asic_vm_fini(rdev);
+	mutex_unlock(&rdev->vm_manager.lock);
+
+	radeon_sa_bo_manager_suspend(rdev, &rdev->vm_manager.sa_manager);
+	radeon_sa_bo_manager_fini(rdev, &rdev->vm_manager.sa_manager);
 	rdev->vm_manager.enabled = false;
 }
 
 /**
- * radeon_vm_get_bos - add the vm BOs to a validation list
+ * radeon_vm_evict - evict page table to make room for new one
+ *
+ * @rdev: radeon_device pointer
+ * @vm: VM we want to allocate something for
  *
- * @vm: vm providing the BOs
- * @head: head of validation list
+ * Evict a VM from the lru, making sure that it isn't @vm. (cayman+).
+ * Returns 0 for success, -ENOMEM for failure.
  *
- * Add the page directory to the list of BOs to
- * validate for command submission (cayman+).
+ * Global and local mutex must be locked!
  */
-struct radeon_cs_reloc *radeon_vm_get_bos(struct radeon_device *rdev,
-					  struct radeon_vm *vm,
-					  struct list_head *head)
+static int radeon_vm_evict(struct radeon_device *rdev, struct radeon_vm *vm)
 {
-	struct radeon_cs_reloc *list;
-	unsigned i, idx;
+	struct radeon_vm *vm_evict;
 
-	list = kmalloc_array(vm->max_pde_used + 2,
-			     sizeof(struct radeon_cs_reloc), GFP_KERNEL);
-	if (!list)
-		return NULL;
+	if (list_empty(&rdev->vm_manager.lru_vm))
+		return -ENOMEM;
 
-	/* add the vm page table to the list */
-	list[0].gobj = NULL;
-	list[0].robj = vm->page_directory;
-	list[0].domain = RADEON_GEM_DOMAIN_VRAM;
-	list[0].alt_domain = RADEON_GEM_DOMAIN_VRAM;
-	list[0].tv.bo = &vm->page_directory->tbo;
-	list[0].tiling_flags = 0;
-	list[0].handle = 0;
-	list_add(&list[0].tv.head, head);
-
-	for (i = 0, idx = 1; i <= vm->max_pde_used; i++) {
-		if (!vm->page_tables[i].bo)
-			continue;
+	vm_evict = list_first_entry(&rdev->vm_manager.lru_vm,
+				    struct radeon_vm, list);
+	if (vm_evict == vm)
+		return -ENOMEM;
+
+	mutex_lock(&vm_evict->mutex);
+	radeon_vm_free_pt(rdev, vm_evict);
+	mutex_unlock(&vm_evict->mutex);
+	return 0;
+}
 
-		list[idx].gobj = NULL;
-		list[idx].robj = vm->page_tables[i].bo;
-		list[idx].domain = RADEON_GEM_DOMAIN_VRAM;
-		list[idx].alt_domain = RADEON_GEM_DOMAIN_VRAM;
-		list[idx].tv.bo = &list[idx].robj->tbo;
-		list[idx].tiling_flags = 0;
-		list[idx].handle = 0;
-		list_add(&list[idx++].tv.head, head);
+/**
+ * radeon_vm_alloc_pt - allocates a page table for a VM
+ *
+ * @rdev: radeon_device pointer
+ * @vm: vm to bind
+ *
+ * Allocate a page table for the requested vm (cayman+).
+ * Returns 0 for success, error for failure.
+ *
+ * Global and local mutex must be locked!
+ */
+int radeon_vm_alloc_pt(struct radeon_device *rdev, struct radeon_vm *vm)
+{
+	unsigned pd_size, pd_entries, pts_size;
+	struct radeon_ib ib;
+	int r;
+
+	if (vm == NULL) {
+		return -EINVAL;
+	}
+
+	if (vm->page_directory != NULL) {
+		return 0;
+	}
+
+	pd_size = radeon_vm_directory_size(rdev);
+	pd_entries = radeon_vm_num_pdes(rdev);
+
+retry:
+	r = radeon_sa_bo_new(rdev, &rdev->vm_manager.sa_manager,
+			     &vm->page_directory, pd_size,
+			     RADEON_VM_PTB_ALIGN_SIZE, false);
+	if (r == -ENOMEM) {
+		r = radeon_vm_evict(rdev, vm);
+		if (r)
+			return r;
+		goto retry;
+
+	} else if (r) {
+		return r;
 	}
 
-	return list;
+	vm->pd_gpu_addr = radeon_sa_bo_gpu_addr(vm->page_directory);
+
+	/* Initially clear the page directory */
+	r = radeon_ib_get(rdev, R600_RING_TYPE_DMA_INDEX, &ib,
+			  NULL, pd_entries * 2 + 64);
+	if (r) {
+		radeon_sa_bo_free(rdev, &vm->page_directory, vm->fence);
+		return r;
+	}
+
+	ib.length_dw = 0;
+
+	radeon_asic_vm_set_page(rdev, &ib, vm->pd_gpu_addr,
+				0, pd_entries, 0, 0);
+
+	radeon_semaphore_sync_to(ib.semaphore, vm->fence);
+	r = radeon_ib_schedule(rdev, &ib, NULL);
+	if (r) {
+		radeon_ib_free(rdev, &ib);
+		radeon_sa_bo_free(rdev, &vm->page_directory, vm->fence);
+		return r;
+	}
+	radeon_fence_unref(&vm->fence);
+	vm->fence = radeon_fence_ref(ib.fence);
+	radeon_ib_free(rdev, &ib);
+	radeon_fence_unref(&vm->last_flush);
+
+	/* allocate page table array */
+	pts_size = radeon_vm_num_pdes(rdev) * sizeof(struct radeon_sa_bo *);
+	vm->page_tables = kzalloc(pts_size, GFP_KERNEL);
+
+	if (vm->page_tables == NULL) {
+		DRM_ERROR("Cannot allocate memory for page table array\n");
+		radeon_sa_bo_free(rdev, &vm->page_directory, vm->fence);
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/**
+ * radeon_vm_add_to_lru - add VMs page table to LRU list
+ *
+ * @rdev: radeon_device pointer
+ * @vm: vm to add to LRU
+ *
+ * Add the allocated page table to the LRU list (cayman+).
+ *
+ * Global mutex must be locked!
+ */
+void radeon_vm_add_to_lru(struct radeon_device *rdev, struct radeon_vm *vm)
+{
+	list_del_init(&vm->list);
+	list_add_tail(&vm->list, &rdev->vm_manager.lru_vm);
 }
 
 /**
@@ -235,14 +393,10 @@ void radeon_vm_flush(struct radeon_device *rdev,
 		     struct radeon_vm *vm,
 		     int ring)
 {
-	uint64_t pd_addr = radeon_bo_gpu_offset(vm->page_directory);
-
 	/* if we can't remember our last VM flush then flush now! */
 	/* XXX figure out why we have to flush all the time */
-	if (!vm->last_flush || true || pd_addr != vm->pd_gpu_addr) {
-		vm->pd_gpu_addr = pd_addr;
+	if (!vm->last_flush || true)
 		radeon_ring_vm_flush(rdev, ring, vm);
-	}
 }
 
 /**
@@ -342,63 +496,6 @@ struct radeon_bo_va *radeon_vm_bo_add(struct radeon_device *rdev,
 }
 
 /**
- * radeon_vm_clear_bo - initially clear the page dir/table
- *
- * @rdev: radeon_device pointer
- * @bo: bo to clear
- */
-static int radeon_vm_clear_bo(struct radeon_device *rdev,
-			      struct radeon_bo *bo)
-{
-        struct ttm_validate_buffer tv;
-        struct ww_acquire_ctx ticket;
-        struct list_head head;
-	struct radeon_ib ib;
-	unsigned entries;
-	uint64_t addr;
-	int r;
-
-        memset(&tv, 0, sizeof(tv));
-        tv.bo = &bo->tbo;
-
-        INIT_LIST_HEAD(&head);
-        list_add(&tv.head, &head);
-
-        r = ttm_eu_reserve_buffers(&ticket, &head);
-        if (r)
-		return r;
-
-        r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
-        if (r)
-                goto error;
-
-	addr = radeon_bo_gpu_offset(bo);
-	entries = radeon_bo_size(bo) / 8;
-
-	r = radeon_ib_get(rdev, R600_RING_TYPE_DMA_INDEX, &ib,
-			  NULL, entries * 2 + 64);
-	if (r)
-                goto error;
-
-	ib.length_dw = 0;
-
-	radeon_asic_vm_set_page(rdev, &ib, addr, 0, entries, 0, 0);
-
-	r = radeon_ib_schedule(rdev, &ib, NULL);
-	if (r)
-                goto error;
-
-	ttm_eu_fence_buffer_objects(&ticket, &head, ib.fence);
-	radeon_ib_free(rdev, &ib);
-
-	return 0;
-
-error:
-	ttm_eu_backoff_reservation(&ticket, &head);
-	return r;
-}
-
-/**
  * radeon_vm_bo_set_addr - set bos virtual address inside a vm
  *
  * @rdev: radeon_device pointer
@@ -422,8 +519,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 	struct radeon_vm *vm = bo_va->vm;
 	struct radeon_bo_va *tmp;
 	struct list_head *head;
-	unsigned last_pfn, pt_idx;
-	int r;
+	unsigned last_pfn;
 
 	if (soffset) {
 		/* make sure object fit at this offset */
@@ -474,53 +570,8 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 	bo_va->valid = false;
 	list_move(&bo_va->vm_list, head);
 
-	soffset = (soffset / RADEON_GPU_PAGE_SIZE) >> RADEON_VM_BLOCK_SIZE;
-	eoffset = (eoffset / RADEON_GPU_PAGE_SIZE) >> RADEON_VM_BLOCK_SIZE;
-
-	if (eoffset > vm->max_pde_used)
-		vm->max_pde_used = eoffset;
-
-	radeon_bo_unreserve(bo_va->bo);
-
-	/* walk over the address space and allocate the page tables */
-	for (pt_idx = soffset; pt_idx <= eoffset; ++pt_idx) {
-		struct radeon_bo *pt;
-
-		if (vm->page_tables[pt_idx].bo)
-			continue;
-
-		/* drop mutex to allocate and clear page table */
-		mutex_unlock(&vm->mutex);
-
-		r = radeon_bo_create(rdev, RADEON_VM_PTE_COUNT * 8,
-				     RADEON_GPU_PAGE_SIZE, false, 
-				     RADEON_GEM_DOMAIN_VRAM, NULL, &pt);
-		if (r)
-			return r;
-
-		r = radeon_vm_clear_bo(rdev, pt);
-		if (r) {
-			radeon_bo_unref(&pt);
-			radeon_bo_reserve(bo_va->bo, false);
-			return r;
-		}
-
-		/* aquire mutex again */
-		mutex_lock(&vm->mutex);
-		if (vm->page_tables[pt_idx].bo) {
-			/* someone else allocated the pt in the meantime */
-			mutex_unlock(&vm->mutex);
-			radeon_bo_unref(&pt);
-			mutex_lock(&vm->mutex);
-			continue;
-		}
-
-		vm->page_tables[pt_idx].addr = 0;
-		vm->page_tables[pt_idx].bo = pt;
-	}
-
 	mutex_unlock(&vm->mutex);
-	return radeon_bo_reserve(bo_va->bo, false);
+	return 0;
 }
 
 /**
@@ -580,54 +631,58 @@ static uint32_t radeon_vm_page_flags(uint32_t flags)
  *
  * Global and local mutex must be locked!
  */
-int radeon_vm_update_page_directory(struct radeon_device *rdev,
-				    struct radeon_vm *vm)
+static int radeon_vm_update_pdes(struct radeon_device *rdev,
+				 struct radeon_vm *vm,
+				 struct radeon_ib *ib,
+				 uint64_t start, uint64_t end)
 {
 	static const uint32_t incr = RADEON_VM_PTE_COUNT * 8;
 
-	struct radeon_bo *pd = vm->page_directory;
-	uint64_t pd_addr = radeon_bo_gpu_offset(pd);
 	uint64_t last_pde = ~0, last_pt = ~0;
-	unsigned count = 0, pt_idx, ndw;
-	struct radeon_ib ib;
+	unsigned count = 0;
+	uint64_t pt_idx;
 	int r;
 
-	/* padding, etc. */
-	ndw = 64;
-
-	/* assume the worst case */
-	ndw += vm->max_pde_used * 16;
-
-	/* update too big for an IB */
-	if (ndw > 0xfffff)
-		return -ENOMEM;
-
-	r = radeon_ib_get(rdev, R600_RING_TYPE_DMA_INDEX, &ib, NULL, ndw * 4);
-	if (r)
-		return r;
-	ib.length_dw = 0;
+	start = (start / RADEON_GPU_PAGE_SIZE) >> RADEON_VM_BLOCK_SIZE;
+	end = (end / RADEON_GPU_PAGE_SIZE) >> RADEON_VM_BLOCK_SIZE;
 
 	/* walk over the address space and update the page directory */
-	for (pt_idx = 0; pt_idx <= vm->max_pde_used; ++pt_idx) {
-		struct radeon_bo *bo = vm->page_tables[pt_idx].bo;
+	for (pt_idx = start; pt_idx <= end; ++pt_idx) {
 		uint64_t pde, pt;
 
-		if (bo == NULL)
+		if (vm->page_tables[pt_idx])
 			continue;
 
-		pt = radeon_bo_gpu_offset(bo);
-		if (vm->page_tables[pt_idx].addr == pt)
-			continue;
-		vm->page_tables[pt_idx].addr = pt;
+retry:
+		r = radeon_sa_bo_new(rdev, &rdev->vm_manager.sa_manager,
+				     &vm->page_tables[pt_idx],
+				     RADEON_VM_PTE_COUNT * 8,
+				     RADEON_GPU_PAGE_SIZE, false);
+
+		if (r == -ENOMEM) {
+			r = radeon_vm_evict(rdev, vm);
+			if (r)
+				return r;
+			goto retry;
+		} else if (r) {
+			return r;
+		}
+
+		pde = vm->pd_gpu_addr + pt_idx * 8;
+
+		pt = radeon_sa_bo_gpu_addr(vm->page_tables[pt_idx]);
 
-		pde = pd_addr + pt_idx * 8;
 		if (((last_pde + 8 * count) != pde) ||
 		    ((last_pt + incr * count) != pt)) {
 
 			if (count) {
-				radeon_asic_vm_set_page(rdev, &ib, last_pde,
+				radeon_asic_vm_set_page(rdev, ib, last_pde,
 							last_pt, count, incr,
 							R600_PTE_VALID);
+
+				count *= RADEON_VM_PTE_COUNT;
+				radeon_asic_vm_set_page(rdev, ib, last_pt, 0,
+							count, 0, 0);
 			}
 
 			count = 1;
@@ -638,23 +693,14 @@ int radeon_vm_update_page_directory(struct radeon_device *rdev,
 		}
 	}
 
-	if (count)
-		radeon_asic_vm_set_page(rdev, &ib, last_pde, last_pt, count,
+	if (count) {
+		radeon_asic_vm_set_page(rdev, ib, last_pde, last_pt, count,
 					incr, R600_PTE_VALID);
 
-	if (ib.length_dw != 0) {
-		radeon_semaphore_sync_to(ib.semaphore, pd->tbo.sync_obj);
-		radeon_semaphore_sync_to(ib.semaphore, vm->last_id_use);
-		r = radeon_ib_schedule(rdev, &ib, NULL);
-		if (r) {
-			radeon_ib_free(rdev, &ib);
-			return r;
-		}
-		radeon_fence_unref(&vm->fence);
-		vm->fence = radeon_fence_ref(ib.fence);
-		radeon_fence_unref(&vm->last_flush);
+		count *= RADEON_VM_PTE_COUNT;
+		radeon_asic_vm_set_page(rdev, ib, last_pt, 0,
+					count, 0, 0);
 	}
-	radeon_ib_free(rdev, &ib);
 
 	return 0;
 }
@@ -691,18 +737,15 @@ static void radeon_vm_update_ptes(struct radeon_device *rdev,
 	/* walk over the address space and update the page tables */
 	for (addr = start; addr < end; ) {
 		uint64_t pt_idx = addr >> RADEON_VM_BLOCK_SIZE;
-		struct radeon_bo *pt = vm->page_tables[pt_idx].bo;
 		unsigned nptes;
 		uint64_t pte;
 
-		radeon_semaphore_sync_to(ib->semaphore, pt->tbo.sync_obj);
-
 		if ((addr & ~mask) == (end & ~mask))
 			nptes = end - addr;
 		else
 			nptes = RADEON_VM_PTE_COUNT - (addr & mask);
 
-		pte = radeon_bo_gpu_offset(pt);
+		pte = radeon_sa_bo_gpu_addr(vm->page_tables[pt_idx]);
 		pte += (addr & mask) * 8;
 
 		if ((last_pte + 8 * count) != pte) {
@@ -743,7 +786,7 @@ static void radeon_vm_update_ptes(struct radeon_device *rdev,
  * Fill in the page table entries for @bo (cayman+).
  * Returns 0 for success, -EINVAL for failure.
  *
- * Object have to be reserved and mutex must be locked!
+ * Object have to be reserved & global and local mutex must be locked!
  */
 int radeon_vm_bo_update(struct radeon_device *rdev,
 			struct radeon_vm *vm,
@@ -752,10 +795,14 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
 {
 	struct radeon_ib ib;
 	struct radeon_bo_va *bo_va;
-	unsigned nptes, ndw;
+	unsigned nptes, npdes, ndw;
 	uint64_t addr;
 	int r;
 
+	/* nothing to do if vm isn't bound */
+	if (vm->page_directory == NULL)
+		return 0;
+
 	bo_va = radeon_vm_bo_find(vm, bo);
 	if (bo_va == NULL) {
 		dev_err(rdev->dev, "bo %p not in vm %p\n", bo, vm);
@@ -793,6 +840,9 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
 
 	nptes = radeon_bo_ngpu_pages(bo);
 
+	/* assume two extra pdes in case the mapping overlaps the borders */
+	npdes = (nptes >> RADEON_VM_BLOCK_SIZE) + 2;
+
 	/* padding, etc. */
 	ndw = 64;
 
@@ -807,6 +857,15 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
 	/* reserve space for pte addresses */
 	ndw += nptes * 2;
 
+	/* reserve space for one header for every 2k dwords */
+	ndw += (npdes >> 11) * 4;
+
+	/* reserve space for pde addresses */
+	ndw += npdes * 2;
+
+	/* reserve space for clearing new page tables */
+	ndw += npdes * 2 * RADEON_VM_PTE_COUNT;
+
 	/* update too big for an IB */
 	if (ndw > 0xfffff)
 		return -ENOMEM;
@@ -816,6 +875,12 @@ int radeon_vm_bo_update(struct radeon_device *rdev,
 		return r;
 	ib.length_dw = 0;
 
+	r = radeon_vm_update_pdes(rdev, vm, &ib, bo_va->soffset, bo_va->eoffset);
+	if (r) {
+		radeon_ib_free(rdev, &ib);
+		return r;
+	}
+
 	radeon_vm_update_ptes(rdev, vm, &ib, bo_va->soffset, bo_va->eoffset,
 			      addr, radeon_vm_page_flags(bo_va->flags));
 
@@ -851,10 +916,12 @@ int radeon_vm_bo_rmv(struct radeon_device *rdev,
 {
 	int r = 0;
 
+	mutex_lock(&rdev->vm_manager.lock);
 	mutex_lock(&bo_va->vm->mutex);
-	if (bo_va->soffset)
+	if (bo_va->soffset) {
 		r = radeon_vm_bo_update(rdev, bo_va->vm, bo_va->bo, NULL);
-
+	}
+	mutex_unlock(&rdev->vm_manager.lock);
 	list_del(&bo_va->vm_list);
 	mutex_unlock(&bo_va->vm->mutex);
 	list_del(&bo_va->bo_list);
@@ -890,43 +957,15 @@ void radeon_vm_bo_invalidate(struct radeon_device *rdev,
  *
  * Init @vm fields (cayman+).
  */
-int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm)
+void radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm)
 {
-	unsigned pd_size, pd_entries, pts_size;
-	int r;
-
 	vm->id = 0;
 	vm->fence = NULL;
 	vm->last_flush = NULL;
 	vm->last_id_use = NULL;
 	mutex_init(&vm->mutex);
+	INIT_LIST_HEAD(&vm->list);
 	INIT_LIST_HEAD(&vm->va);
-
-	pd_size = radeon_vm_directory_size(rdev);
-	pd_entries = radeon_vm_num_pdes(rdev);
-
-	/* allocate page table array */
-	pts_size = pd_entries * sizeof(struct radeon_vm_pt);
-	vm->page_tables = kzalloc(pts_size, GFP_KERNEL);
-	if (vm->page_tables == NULL) {
-		DRM_ERROR("Cannot allocate memory for page table array\n");
-		return -ENOMEM;
-	}
-
-	r = radeon_bo_create(rdev, pd_size, RADEON_VM_PTB_ALIGN_SIZE, false,
-			     RADEON_GEM_DOMAIN_VRAM, NULL,
-			     &vm->page_directory);
-	if (r)
-		return r;
-
-	r = radeon_vm_clear_bo(rdev, vm->page_directory);
-	if (r) {
-		radeon_bo_unref(&vm->page_directory);
-		vm->page_directory = NULL;
-		return r;
-	}
-
-	return 0;
 }
 
 /**
@@ -941,7 +980,12 @@ int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm)
 void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm)
 {
 	struct radeon_bo_va *bo_va, *tmp;
-	int i, r;
+	int r;
+
+	mutex_lock(&rdev->vm_manager.lock);
+	mutex_lock(&vm->mutex);
+	radeon_vm_free_pt(rdev, vm);
+	mutex_unlock(&rdev->vm_manager.lock);
 
 	if (!list_empty(&vm->va)) {
 		dev_err(rdev->dev, "still active bo inside vm\n");
@@ -955,17 +999,8 @@ void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm)
 			kfree(bo_va);
 		}
 	}
-
-
-	for (i = 0; i < radeon_vm_num_pdes(rdev); i++)
-		radeon_bo_unref(&vm->page_tables[i].bo);
-	kfree(vm->page_tables);
-
-	radeon_bo_unref(&vm->page_directory);
-
 	radeon_fence_unref(&vm->fence);
 	radeon_fence_unref(&vm->last_flush);
 	radeon_fence_unref(&vm->last_id_use);
-
-	mutex_destroy(&vm->mutex);
+	mutex_unlock(&vm->mutex);
 }
-- 
1.9.1

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel
