Re: [PATCH] drm/radeon: Add support for userspace fence waits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01.02.2012 12:31, Simon Farnsworth wrote:
Christian,

You said elsewhere that you have half-finished patches that illustrate the
interface Jerome prefers - if you send them to me, I can work on finishing
them.

Responding to the rest of the message:
Well it's probably easier to type that down again instead of searching in a two month old reflog....

Patches are attached, but be aware that they are WIP and beside compile testing completely untested.

Whats missing is the kernel interface, but that shouldn't be to complicated to implement.

Christian.


On Tuesday 31 January 2012, Jerome Glisse<j.glisse@xxxxxxxxx>  wrote:
On Tue, Jan 31, 2012 at 01:55:43PM -0500, David Airlie wrote:
Some comments below.

+	r = radeon_bo_pin(robj, RADEON_GEM_DOMAIN_GTT, NULL);
+	if (r) {
+		goto unreserve;
+	}
+
+	r = radeon_bo_kmap(robj,&buffer_data);
+	if (r) {
+		goto unpin;
+	}
+

Do you need to pin it? I think if you have it reserved and you are in here you shouldn't need to. (unless kmapping requires a pin)/
No you don't need to pin, actually it's bad to pin the buffer you might
want to wait on might be in vram.

I'll remove the pin/unpin calls - I was copying use from elsewhere in the
kernel, when trying to work out why it didn't work as expected.

Also you never assume that things could go wrong and that your value
might never be written to the buffer. So it will wait forever in the
kernel.

I'm using wait_event_interruptible_timeout - userspace can pass me a timeout
if it knows how long it expects to wait (and recover if it takes too long),
and will be interrupted by signals. In an earlier version of the code, I
didn't enable the EOP interrupts, and was able to recover userspace simply
by sending it a SIGSTOP/SIGCONT pair (until the next fence, when it stalled
again).

In that respect, this is no different to the existing situation in userspace
- if the GPU is reset before the EVENT_WRITE_EOP executes, userspace will
spin until a signal interrupts it. All that changes if userspace uses this
ioctl is that userspace will sleep instead of burning a hole in my CPU
budget - the recovery strategy is unchanged.

As i said in the other mail i would rather as a wait on submited cs
ioctl and add some return value from cs ioctl. I can't remember the
outcome of this discusion we had when we were doing virtual memory
support. Alex ? Michel ? better memory than i do ? :)

If someone has half-complete patches illustrating what you mean, I can take
on the work of finishing them off. I've cc'd Christian on this mail, as it
looks like he has a starting point.

One slight advantage this interface has over a "wait on submitted CS ioctl"
interface is that, given suitable userspace changes, it becomes possible to
submit multiple fences in one IB, and wait for them individually (or only
wait for a subset of them). However, I'm not sure that that's enough of an
advantage to outweigh the disadvantages, especially given that the userspace
changes required are fairly intrusive.

Cheers,
Jerome


+	radeon_irq_kms_sw_irq_get(rdev, ring);
+
+	fence_data = (uint32_t*)buffer_data;
+	timeout =
an
+ * interrupt and the value in the buffer might have changed.
+ */
+struct drm_radeon_gem_wait_user_fence {
+	uint32_t                handle;
+	uint32_t                ring;
+	uint64_t                offset;
+	uint32_t                value;
+	uint64_t                timeout_usec;
Align things here, 64 then 64 then 32 32 32 and a 32 pad.

Will do in v2.

>From 99f526793e869749209cedc60eb9f109e7046bb3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <deathsimple@xxxxxxxxxxx>
Date: Wed, 1 Feb 2012 15:58:20 +0100
Subject: [PATCH 1/2] drm/radeon: expand fence sequence values to 64 bit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

They are protected by a read/write lock anyway, so
we actually don't need to use the atomic type.

Signed-off-by: Christian König <deathsimple@xxxxxxxxxxx>
---
 drivers/gpu/drm/radeon/radeon.h       |    6 +++---
 drivers/gpu/drm/radeon/radeon_fence.c |   29 +++++++++++++++++------------
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 73e05cb..2f4fb4d 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -209,8 +209,8 @@ struct radeon_fence_driver {
 	uint32_t			scratch_reg;
 	uint64_t			gpu_addr;
 	volatile uint32_t		*cpu_addr;
-	atomic_t			seq;
-	uint32_t			last_seq;
+	uint64_t			seq;
+	uint64_t			last_seq;
 	unsigned long			last_jiffies;
 	unsigned long			last_timeout;
 	wait_queue_head_t		queue;
@@ -225,7 +225,7 @@ struct radeon_fence {
 	struct kref			kref;
 	struct list_head		list;
 	/* protected by radeon_fence.lock */
-	uint32_t			seq;
+	uint64_t			seq;
 	bool				emitted;
 	bool				signaled;
 	/* RB, DMA, etc. */
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 64ea3dd..ac177c5 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -29,7 +29,6 @@
  *    Dave Airlie
  */
 #include <linux/seq_file.h>
-#include <linux/atomic.h>
 #include <linux/wait.h>
 #include <linux/list.h>
 #include <linux/kref.h>
@@ -70,7 +69,7 @@ int radeon_fence_emit(struct radeon_device *rdev, struct radeon_fence *fence)
 		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 		return 0;
 	}
-	fence->seq = atomic_add_return(1, &rdev->fence_drv[fence->ring].seq);
+	fence->seq = ++rdev->fence_drv[fence->ring].seq;
 	if (!rdev->ring[fence->ring].ready)
 		/* FIXME: cp is not running assume everythings is done right
 		 * away
@@ -90,12 +89,18 @@ static bool radeon_fence_poll_locked(struct radeon_device *rdev, int ring)
 {
 	struct radeon_fence *fence;
 	struct list_head *i, *n;
-	uint32_t seq;
+	uint64_t seq;
 	bool wake = false;
 	unsigned long cjiffies;
 
 	seq = radeon_fence_read(rdev, ring);
+	seq |= rdev->fence_drv[ring].last_seq & 0xFFFFFFFF00000000;
 	if (seq != rdev->fence_drv[ring].last_seq) {
+		if (seq < rdev->fence_drv[ring].last_seq) {
+			/* sequence wrapped around */
+			seq = (seq & 0xFFFFFFFF) | 
+			      (rdev->fence_drv[ring].seq & 0xFFFFFFFF00000000);
+		}
 		rdev->fence_drv[ring].last_seq = seq;
 		rdev->fence_drv[ring].last_jiffies = jiffies;
 		rdev->fence_drv[ring].last_timeout = RADEON_FENCE_JIFFIES_TIMEOUT;
@@ -216,7 +221,7 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 {
 	struct radeon_device *rdev;
 	unsigned long irq_flags, timeout;
-	u32 seq;
+	uint64_t last_seq;
 	int r;
 
 	if (fence == NULL) {
@@ -230,8 +235,8 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 	timeout = rdev->fence_drv[fence->ring].last_timeout;
 retry:
 	/* save current sequence used to check for GPU lockup */
-	seq = rdev->fence_drv[fence->ring].last_seq;
-	trace_radeon_fence_wait_begin(rdev->ddev, seq);
+	last_seq = rdev->fence_drv[fence->ring].last_seq;
+	trace_radeon_fence_wait_begin(rdev->ddev, last_seq);
 	if (intr) {
 		radeon_irq_kms_sw_irq_get(rdev, fence->ring);
 		r = wait_event_interruptible_timeout(rdev->fence_drv[fence->ring].queue,
@@ -246,7 +251,7 @@ retry:
 			 radeon_fence_signaled(fence), timeout);
 		radeon_irq_kms_sw_irq_put(rdev, fence->ring);
 	}
-	trace_radeon_fence_wait_end(rdev->ddev, seq);
+	trace_radeon_fence_wait_end(rdev->ddev, last_seq);
 	if (unlikely(!radeon_fence_signaled(fence))) {
 		/* we were interrupted for some reason and fence isn't
 		 * isn't signaled yet, resume wait
@@ -258,11 +263,11 @@ retry:
 		/* don't protect read access to rdev->fence_drv[t].last_seq
 		 * if we experiencing a lockup the value doesn't change
 		 */
-		if (seq == rdev->fence_drv[fence->ring].last_seq &&
+		if (last_seq == rdev->fence_drv[fence->ring].last_seq &&
 		    radeon_gpu_is_lockup(rdev, &rdev->ring[fence->ring])) {
 			/* good news we believe it's a lockup */
 			printk(KERN_WARNING "GPU lockup (waiting for 0x%08X last fence id 0x%08X)\n",
-			     fence->seq, seq);
+			     (uint32_t)fence->seq, (uint32_t)last_seq);
 			/* FIXME: what should we do ? marking everyone
 			 * as signaled for now
 			 */
@@ -403,7 +408,7 @@ int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring)
 	}
 	rdev->fence_drv[ring].cpu_addr = &rdev->wb.wb[index/4];
 	rdev->fence_drv[ring].gpu_addr = rdev->wb.gpu_addr + index;
-	radeon_fence_write(rdev, atomic_read(&rdev->fence_drv[ring].seq), ring);
+	radeon_fence_write(rdev, rdev->fence_drv[ring].seq, ring);
 	rdev->fence_drv[ring].initialized = true;
 	DRM_INFO("fence driver on ring %d use gpu addr 0x%08Lx and cpu addr 0x%p\n",
 		 ring, rdev->fence_drv[ring].gpu_addr, rdev->fence_drv[ring].cpu_addr);
@@ -416,7 +421,7 @@ static void radeon_fence_driver_init_ring(struct radeon_device *rdev, int ring)
 	rdev->fence_drv[ring].scratch_reg = -1;
 	rdev->fence_drv[ring].cpu_addr = NULL;
 	rdev->fence_drv[ring].gpu_addr = 0;
-	atomic_set(&rdev->fence_drv[ring].seq, 0);
+	rdev->fence_drv[ring].seq = 0;
 	INIT_LIST_HEAD(&rdev->fence_drv[ring].created);
 	INIT_LIST_HEAD(&rdev->fence_drv[ring].emitted);
 	INIT_LIST_HEAD(&rdev->fence_drv[ring].signaled);
@@ -481,7 +486,7 @@ static int radeon_debugfs_fence_info(struct seq_file *m, void *data)
 			fence = list_entry(rdev->fence_drv[i].emitted.prev,
 					   struct radeon_fence, list);
 			seq_printf(m, "Last emitted fence %p with 0x%08X\n",
-				   fence,  fence->seq);
+				   fence,  (uint32_t)fence->seq);
 		}
 	}
 	return 0;
-- 
1.7.5.4

>From 2d3d429627d94143b0a664388496dad06a48426a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <deathsimple@xxxxxxxxxxx>
Date: Wed, 1 Feb 2012 19:16:34 +0100
Subject: [PATCH 2/2] drm/radeon: interface waiting for fence values
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

To support waiting for fence values from usermode.

Signed-off-by: Christian König <deathsimple@xxxxxxxxxxx>
---
 drivers/gpu/drm/radeon/radeon_fence.c |   22 ++++++++++++++++++++++
 1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index ac177c5..85893c3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -334,6 +334,28 @@ int radeon_fence_wait_last(struct radeon_device *rdev, int ring)
 	return r;
 }
 
+static bool radeon_fence_value_reached(struct radeon_device *rdev,
+				       int ring, uint64_t value)
+{
+	unsigned long irq_flags;
+	bool result;
+	read_lock_irqsave(&rdev->fence_lock, irq_flags);
+	result = rdev->fence_drv[ring].last_seq <= value;
+	read_unlock_irqrestore(&rdev->fence_lock, irq_flags);
+	return result;
+}
+
+int radeon_fence_wait_value(struct radeon_device *rdev, int ring,
+			    uint64_t value, unsigned long timeout)
+{
+	int r;
+	radeon_irq_kms_sw_irq_get(rdev, ring);
+	r = wait_event_interruptible_timeout(rdev->fence_drv[ring].queue,
+			radeon_fence_value_reached(rdev, ring, value), timeout);
+	radeon_irq_kms_sw_irq_put(rdev, ring);
+	return r;
+}
+
 struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence)
 {
 	kref_get(&fence->kref);
-- 
1.7.5.4

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel

[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux