Re: [RFC 3/9] drm/i915: Convert requests to use struct fence

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21/07/2015 08:05, Daniel Vetter wrote:
On Fri, Jul 17, 2015 at 03:31:17PM +0100, John.C.Harrison@xxxxxxxxx wrote:
From: John Harrison <John.C.Harrison@xxxxxxxxx>

There is a construct in the linux kernel called 'struct fence' that is intended
to keep track of work that is executed on hardware. I.e. it solves the basic
problem that the drivers 'struct drm_i915_gem_request' is trying to address. The
request structure does quite a lot more than simply track the execution progress
so is very definitely still required. However, the basic completion status side
could be updated to use the ready made fence implementation and gain all the
advantages that provides.

This patch makes the first step of integrating a struct fence into the request.
It replaces the explicit reference count with that of the fence. It also
replaces the 'is completed' test with the fence's equivalent. Currently, that
simply chains on to the original request implementation. A future patch will
improve this.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
---
  drivers/gpu/drm/i915/i915_drv.h         | 45 +++++++++++++------------
  drivers/gpu/drm/i915/i915_gem.c         | 58 ++++++++++++++++++++++++++++++---
  drivers/gpu/drm/i915/intel_lrc.c        |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
  drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
  5 files changed, 80 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cf6761c..79d346c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -50,6 +50,7 @@
  #include <linux/intel-iommu.h>
  #include <linux/kref.h>
  #include <linux/pm_qos.h>
+#include <linux/fence.h>
/* General customization:
   */
@@ -2150,7 +2151,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
   * initial reference taken using kref_init
   */
  struct drm_i915_gem_request {
-	struct kref ref;
+	/**
+	 * Underlying object for implementing the signal/wait stuff.
+	 * NB: Never call fence_later() or return this fence object to user
+	 * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
+	 * etc., there is no guarantee at all about the validity or
+	 * sequentiality of the fence's seqno! It is also unsafe to let
+	 * anything outside of the i915 driver get hold of the fence object
+	 * as the clean up when decrementing the reference count requires
+	 * holding the driver mutex lock.
+	 */
This comment is outdated.
Not at this point in the patch series. Until the 'delay freeing of requests' patch it is very definitely unsafe to allow external reference counting of the fence object as decrementing the count must only be done with the mutex lock held. Likewise, without the per context per ring timeline, the seqno value would unsafe if any scheduling code was added. The comment is written from the point of view of what happens if this patch is taken as a stand alone patch without the rest of the series following and describes the state of the driver at that moment in time.

Also I'm leaning towards squashing this patch
with the one implementing fences with explicit irq enabling, to avoid
churn and intermediate WARN_ONs. Each patch should be fully functional
without requiring follow-up patches.

It is fully functional as a stand alone patch. The intermediate WARN_ONs would never fire unless someone takes only this patch and the starts adding extra (unsupported) usage of the fence object. If they want to do that then they must first add in the extra support that they need. Or they could just take the rest of the series which adds in that support for them.

Squashing the interrupt support into the basic fence implementation would just make for a much more complicated single patch.


-Daniel


+	struct fence fence;
/** On Which ring this request was generated */
  	struct drm_i915_private *i915;
@@ -2227,7 +2238,13 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
  			   struct intel_context *ctx,
  			   struct drm_i915_gem_request **req_out);
  void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
+					      bool lazy_coherency)
+{
+	return fence_is_signaled(&req->fence);
+}
+
  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
  				   struct drm_file *file);
@@ -2247,7 +2264,7 @@ static inline struct drm_i915_gem_request *
  i915_gem_request_reference(struct drm_i915_gem_request *req)
  {
  	if (req)
-		kref_get(&req->ref);
+		fence_get(&req->fence);
  	return req;
  }
@@ -2255,7 +2272,7 @@ static inline void
  i915_gem_request_unreference(struct drm_i915_gem_request *req)
  {
  	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
-	kref_put(&req->ref, i915_gem_request_free);
+	fence_put(&req->fence);
  }
static inline void
@@ -2267,7 +2284,7 @@ i915_gem_request_unreference__unlocked(struct drm_i915_gem_request *req)
  		return;
dev = req->ring->dev;
-	if (kref_put_mutex(&req->ref, i915_gem_request_free, &dev->struct_mutex))
+	if (kref_put_mutex(&req->fence.refcount, fence_release, &dev->struct_mutex))
  		mutex_unlock(&dev->struct_mutex);
  }
@@ -2284,12 +2301,6 @@ static inline void i915_gem_request_assign(struct drm_i915_gem_request **pdst,
  }
/*
- * XXX: i915_gem_request_completed should be here but currently needs the
- * definition of i915_seqno_passed() which is below. It will be moved in
- * a later patch when the call to i915_seqno_passed() is obsoleted...
- */
-
-/*
   * A command that requires special handling by the command parser.
   */
  struct drm_i915_cmd_descriptor {
@@ -2851,18 +2862,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
  	return (int32_t)(seq1 - seq2) >= 0;
  }
-static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
-					      bool lazy_coherency)
-{
-	u32 seqno;
-
-	BUG_ON(req == NULL);
-
-	seqno = req->ring->get_seqno(req->ring, lazy_coherency);
-
-	return i915_seqno_passed(seqno, req->seqno);
-}
-
  int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
  int __must_check i915_gem_set_seqno(struct drm_device *dev, u32 seqno);
  int __must_check i915_gem_object_get_fence(struct drm_i915_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d9f2701..888bb72 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2616,12 +2616,14 @@ static void i915_set_reset_status(struct drm_i915_private *dev_priv,
  	}
  }
-void i915_gem_request_free(struct kref *req_ref)
+static void i915_gem_request_free(struct fence *req_fence)
  {
-	struct drm_i915_gem_request *req = container_of(req_ref,
-						 typeof(*req), ref);
+	struct drm_i915_gem_request *req = container_of(req_fence,
+						 typeof(*req), fence);
  	struct intel_context *ctx = req->ctx;
+ BUG_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
+
  	if (req->file_priv)
  		i915_gem_request_remove_from_client(req);
@@ -2637,6 +2639,47 @@ void i915_gem_request_free(struct kref *req_ref)
  	kmem_cache_free(req->i915->requests, req);
  }
+static const char *i915_gem_request_get_driver_name(struct fence *req_fence)
+{
+	return "i915_request";
+}
+
+static const char *i915_gem_request_get_timeline_name(struct fence *req_fence)
+{
+	struct drm_i915_gem_request *req = container_of(req_fence,
+						 typeof(*req), fence);
+	return req->ring->name;
+}
+
+static bool i915_gem_request_enable_signaling(struct fence *req_fence)
+{
+	/* Interrupt driven fences are not implemented yet.*/
+	WARN(true, "This should not be called!");
+	return true;
+}
+
+static bool i915_gem_request_is_completed(struct fence *req_fence)
+{
+	struct drm_i915_gem_request *req = container_of(req_fence,
+						 typeof(*req), fence);
+	u32 seqno;
+
+	BUG_ON(req == NULL);
+
+	seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
+
+	return i915_seqno_passed(seqno, req->seqno);
+}
+
+static const struct fence_ops i915_gem_request_fops = {
+	.get_driver_name	= i915_gem_request_get_driver_name,
+	.get_timeline_name	= i915_gem_request_get_timeline_name,
+	.enable_signaling	= i915_gem_request_enable_signaling,
+	.signaled		= i915_gem_request_is_completed,
+	.wait			= fence_default_wait,
+	.release		= i915_gem_request_free,
+};
+
  int i915_gem_request_alloc(struct intel_engine_cs *ring,
  			   struct intel_context *ctx,
  			   struct drm_i915_gem_request **req_out)
@@ -2658,7 +2701,6 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
  	if (ret)
  		goto err;
- kref_init(&req->ref);
  	req->i915 = dev_priv;
  	req->ring = ring;
  	req->ctx  = ctx;
@@ -2673,6 +2715,8 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
  		goto err;
  	}
+ fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock, ring->fence_context, req->seqno);
+
  	/*
  	 * Reserve space in the ring buffer for all the commands required to
  	 * eventually emit this request. This is to guarantee that the
@@ -5021,7 +5065,7 @@ i915_gem_init_hw(struct drm_device *dev)
  {
  	struct drm_i915_private *dev_priv = dev->dev_private;
  	struct intel_engine_cs *ring;
-	int ret, i, j;
+	int ret, i, j, fence_base;
if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
  		return -EIO;
@@ -5073,12 +5117,16 @@ i915_gem_init_hw(struct drm_device *dev)
  			goto out;
  	}
+ fence_base = fence_context_alloc(I915_NUM_RINGS);
+
  	/* Now it is safe to go back round and do everything else: */
  	for_each_ring(ring, dev_priv, i) {
  		struct drm_i915_gem_request *req;
WARN_ON(!ring->default_context); + ring->fence_context = fence_base + i;
+
  		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
  		if (ret) {
  			i915_gem_cleanup_ringbuffer(dev);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 9faad82..ee4aecd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1808,6 +1808,7 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
  	ring->dev = dev;
  	INIT_LIST_HEAD(&ring->active_list);
  	INIT_LIST_HEAD(&ring->request_list);
+	spin_lock_init(&ring->fence_lock);
  	i915_gem_batch_pool_init(dev, &ring->batch_pool);
  	init_waitqueue_head(&ring->irq_queue);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 177f7ed..d1ced30 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2040,6 +2040,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
  	INIT_LIST_HEAD(&ring->active_list);
  	INIT_LIST_HEAD(&ring->request_list);
  	INIT_LIST_HEAD(&ring->execlist_queue);
+	spin_lock_init(&ring->fence_lock);
  	i915_gem_batch_pool_init(dev, &ring->batch_pool);
  	ringbuf->size = 32 * PAGE_SIZE;
  	ringbuf->ring = ring;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2e85fda..a4b0545 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -346,6 +346,9 @@ struct  intel_engine_cs {
  	 * to encode the command length in the header).
  	 */
  	u32 (*get_cmd_length_mask)(u32 cmd_header);
+
+	unsigned fence_context;
+	spinlock_t fence_lock;
  };
bool intel_ring_initialized(struct intel_engine_cs *ring);
--
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux