[PATCH] [RFC] intel: Non-LLC based non-blocking maps.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



The history on this patch goes back quite a way. This time around, the
patch builds on top of the map_unsynchronized that Eric pushed. Eric's
patch attempted only to solve the problem for LLC machines. Unlike
my earlier versions of this patch (with the help from Daniel Vetter), we
do not attempt to cpu map objects in a unsynchronized manner.

The concept is fairly simple - once a buffer is moved into the GTT
domain, we can assume it remains there unless we tell it otherwise (via
cpu map). It therefore stands to reason that as long as we can keep the
object in the GTT domain, and don't ever count on reading back contents,
things might just work. I believe as long as we are doing GTT mappings
only, we get to avoid worry about clflushing the dirtied cachelines, but
that could use some fact checking.

The patch makes some assumptions about how the kernel does buffer
tracking, this could be conceived as an ABI dependency, but actually the
behavior is pretty confined. It exploits the fact the BOs are only moved
into the CPU domain under certain circumstances, and daintily dances
around those conditions. The main thing here is we assume MADV_WILLNEED
prevents the object from getting evicted.

I am not aware of a good way to test it's effectiveness
performance-wise; but it introduces no regressions with piglit on my
ILK, or SNB.

Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
---
 intel/intel_bufmgr_gem.c |   46 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 12 deletions(-)

diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
index b776d2f..3872fbb 100644
--- a/intel/intel_bufmgr_gem.c
+++ b/intel/intel_bufmgr_gem.c
@@ -204,6 +204,11 @@ struct _drm_intel_bo_gem {
 	bool reusable;
 
 	/**
+	 * Boolean of whether or not we think this buffer is in the GTT domain.
+	 * */
+	bool in_gtt_domain;
+
+	/**
 	 * Size in bytes of this buffer and its relocation descendents.
 	 *
 	 * Used to avoid costly tree walking in
@@ -1188,6 +1193,9 @@ static int drm_intel_gem_bo_map(drm_intel_bo *bo, int write_enable)
 	if (write_enable)
 		bo_gem->mapped_cpu_write = true;
 
+	if (!bufmgr_gem->has_llc)
+		bo_gem->in_gtt_domain = false;
+
 	drm_intel_gem_bo_mark_mmaps_incoherent(bo);
 	VG(VALGRIND_MAKE_MEM_DEFINED(bo_gem->mem_virtual, bo->size));
 	pthread_mutex_unlock(&bufmgr_gem->lock);
@@ -1292,6 +1300,7 @@ int drm_intel_gem_bo_map_gtt(drm_intel_bo *bo)
 		    strerror(errno));
 	}
 
+	bo_gem->in_gtt_domain = true;
 	drm_intel_gem_bo_mark_mmaps_incoherent(bo);
 	VG(VALGRIND_MAKE_MEM_DEFINED(bo_gem->gtt_virtual, bo->size));
 	pthread_mutex_unlock(&bufmgr_gem->lock);
@@ -1311,27 +1320,39 @@ int drm_intel_gem_bo_map_gtt(drm_intel_bo *bo)
  * data that the GPU is busy using (or, more specifically, that if it
  * does write over the data, it acknowledges that rendering is
  * undefined).
+ *
  */
-
 int drm_intel_gem_bo_map_unsynchronized(drm_intel_bo *bo)
 {
 	drm_intel_bufmgr_gem *bufmgr_gem = (drm_intel_bufmgr_gem *) bo->bufmgr;
+	drm_intel_bo_gem *bo_gem = (drm_intel_bo_gem *) bo;
 	int ret;
 
-	/* If the CPU cache isn't coherent with the GTT, then use a
-	 * regular synchronized mapping.  The problem is that we don't
-	 * track where the buffer was last used on the CPU side in
-	 * terms of drm_intel_bo_map vs drm_intel_gem_bo_map_gtt, so
-	 * we would potentially corrupt the buffer even when the user
-	 * does reasonable things.
-	 */
-	if (!bufmgr_gem->has_llc)
-		return drm_intel_gem_bo_map_gtt(bo);
-
 	pthread_mutex_lock(&bufmgr_gem->lock);
+
 	ret = map_gtt(bo);
-	pthread_mutex_unlock(&bufmgr_gem->lock);
+	if (ret)
+		goto out;
+
+	/* LLC + gtt makes this function quite safe. */
+	if (!bufmgr_gem->has_llc) {
+		if (!bo_gem->in_gtt_domain) {
+			/* Put the object in the gtt domain. We can assume the
+			 * object remains there unless set_domain happens which
+			 * can only occur if the buffer is flinked, or we do it
+			 * ourself. The first two can be managed in userspace,
+			 * and the last one works as long as we assume
+			 * MADV_WILLNEED prevents unbinding.
+			 */
+			ret = drm_intel_gem_bo_madvise_internal(bufmgr_gem, bo_gem, MADV_WILLNEED);
+			if (ret)
+				goto out;
+			drm_intel_gem_bo_start_gtt_access(bo, 1);
+		}
+	}
 
+out:
+	pthread_mutex_unlock(&bufmgr_gem->lock);
 	return ret;
 }
 
@@ -1506,6 +1527,7 @@ drm_intel_gem_bo_start_gtt_access(drm_intel_bo *bo, int write_enable)
 		    set_domain.read_domains, set_domain.write_domain,
 		    strerror(errno));
 	}
+	bo_gem->in_gtt_domain = true;
 }
 
 static void
-- 
1.7.10.4



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux