Re: [PATCH v4] drm/i915/guc: Clean up locks in GuC

Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> · Tue, 26 Jan 2016 11:53:45 +0000

On 25/01/16 16:01, Daniel Vetter wrote:
On Fri, Jan 22, 2016 at 10:54:42AM +0000, Tvrtko Ursulin wrote:

On 03/12/15 00:56, yu.dai@xxxxxxxxx wrote:
From: Alex Dai <yu.dai@xxxxxxxxx>

For now, remove the spinlocks that protected the GuC's
statistics block and work queue; they are only accessed
by code that already holds the global struct_mutex, and
so are redundant (until the big struct_mutex rewrite!).

The specific problem that the spinlocks caused was that
if the work queue was full, the driver would try to
spinwait for one jiffy, but with interrupts disabled the
jiffy count would not advance, leading to a system hang.
The issue was found using test case igt/gem_close_race.

The new version will usleep() instead, still holding
the struct_mutex but without any spinlocks.

v4: Reorganize commit message (Dave Gordon)
v3: Remove unnecessary whitespace churn
v2: Clean up wq_lock too
v1: Clean up host2guc lock as well

Signed-off-by: Alex Dai <yu.dai@xxxxxxxxx>
---
   drivers/gpu/drm/i915/i915_debugfs.c        | 12 ++++++------
   drivers/gpu/drm/i915/i915_guc_submission.c | 31 ++++++------------------------
   drivers/gpu/drm/i915/intel_guc.h           |  4 ----
   3 files changed, 12 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff1..d6b7817 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2473,15 +2473,15 @@ static int i915_guc_info(struct seq_file *m, void *data)
   	if (!HAS_GUC_SCHED(dev_priv->dev))
   		return 0;

+	if (mutex_lock_interruptible(&dev->struct_mutex))
+		return 0;
+
   	/* Take a local copy of the GuC data, so we can dump it at leisure */
-	spin_lock(&dev_priv->guc.host2guc_lock);
   	guc = dev_priv->guc;
-	if (guc.execbuf_client) {
-		spin_lock(&guc.execbuf_client->wq_lock);
+	if (guc.execbuf_client)
   		client = *guc.execbuf_client;
-		spin_unlock(&guc.execbuf_client->wq_lock);
-	}
-	spin_unlock(&dev_priv->guc.host2guc_lock);
+
+	mutex_unlock(&dev->struct_mutex);

   	seq_printf(m, "GuC total action count: %llu\n", guc.action_count);
   	seq_printf(m, "GuC action failure count: %u\n", guc.action_fail);
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index ed9f100..a7f9785 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -86,7 +86,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
   		return -EINVAL;

   	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
-	spin_lock(&dev_priv->guc.host2guc_lock);

   	dev_priv->guc.action_count += 1;
   	dev_priv->guc.action_cmd = data[0];
@@ -119,7 +118,6 @@ static int host2guc_action(struct intel_guc *guc, u32 *data, u32 len)
   	}
   	dev_priv->guc.action_status = status;

-	spin_unlock(&dev_priv->guc.host2guc_lock);
   	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);

   	return ret;
@@ -292,16 +290,12 @@ static uint32_t select_doorbell_cacheline(struct intel_guc *guc)
   	const uint32_t cacheline_size = cache_line_size();
   	uint32_t offset;

-	spin_lock(&guc->host2guc_lock);
-
   	/* Doorbell uses a single cache line within a page */
   	offset = offset_in_page(guc->db_cacheline);

   	/* Moving to next cache line to reduce contention */
   	guc->db_cacheline += cacheline_size;

-	spin_unlock(&guc->host2guc_lock);
-
   	DRM_DEBUG_DRIVER("selected doorbell cacheline 0x%x, next 0x%x, linesize %u\n",
   			offset, guc->db_cacheline, cacheline_size);

@@ -322,13 +316,11 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)
   	const uint16_t end = start + half;
   	uint16_t id;

-	spin_lock(&guc->host2guc_lock);
   	id = find_next_zero_bit(guc->doorbell_bitmap, end, start);
   	if (id == end)
   		id = GUC_INVALID_DOORBELL_ID;
   	else
   		bitmap_set(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);

   	DRM_DEBUG_DRIVER("assigned %s priority doorbell id 0x%x\n",
   			hi_pri ? "high" : "normal", id);
@@ -338,9 +330,7 @@ static uint16_t assign_doorbell(struct intel_guc *guc, uint32_t priority)

   static void release_doorbell(struct intel_guc *guc, uint16_t id)
   {
-	spin_lock(&guc->host2guc_lock);
   	bitmap_clear(guc->doorbell_bitmap, id, 1);
-	spin_unlock(&guc->host2guc_lock);
   }

   /*
@@ -487,16 +477,13 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)
   	struct guc_process_desc *desc;
   	void *base;
   	u32 size = sizeof(struct guc_wq_item);
-	int ret = 0, timeout_counter = 200;
+	int ret = -ETIMEDOUT, timeout_counter = 200;

   	base = kmap_atomic(i915_gem_object_get_page(gc->client_obj, 0));
   	desc = base + gc->proc_desc_offset;

   	while (timeout_counter-- > 0) {
-		ret = wait_for_atomic(CIRC_SPACE(gc->wq_tail, desc->head,
-				gc->wq_size) >= size, 1);
-
-		if (!ret) {
+		if (CIRC_SPACE(gc->wq_tail, desc->head, gc->wq_size) >= size) {
   			*offset = gc->wq_tail;

   			/* advance the tail for next workqueue item */
@@ -505,7 +492,11 @@ static int guc_get_workqueue_space(struct i915_guc_client *gc, u32 *offset)

   			/* this will break the loop */
   			timeout_counter = 0;
+			ret = 0;
   		}
+
+		if (timeout_counter)
+			usleep_range(1000, 2000);

Sleeping is illegal while holding kmap_atomic mapping:

[   34.098798] BUG: scheduling while atomic: gem_close_race/1941/0x00000002
[   34.098822] Modules linked in: hid_generic usbhid i915 asix usbnet libphy mii i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm coretemp i2c_hid hid video pinctrl_sunrisepoint pinctrl_intel acpi_pad nls_iso8859_1 e1000e ptp psmouse pps_core ahci libahci
[   34.098824] CPU: 0 PID: 1941 Comm: gem_close_race Tainted: G     U          4.4.0-160121+ #123
[   34.098824] Hardware name: Intel Corporation Skylake Client platform/Skylake AIO DDR3L RVP10, BIOS SKLSE2R1.R00.X100.B01.1509220551 09/22/2015
[   34.098825]  0000000000013e40 ffff880166c27a78 ffffffff81280d02 ffff880172c13e40
[   34.098826]  ffff880166c27a88 ffffffff810c203a ffff880166c27ac8 ffffffff814ec808
[   34.098827]  ffff88016b7c6000 ffff880166c28000 00000000000f4240 0000000000000001
[   34.098827] Call Trace:
[   34.098831]  [<ffffffff81280d02>] dump_stack+0x4b/0x79
[   34.098833]  [<ffffffff810c203a>] __schedule_bug+0x41/0x4f
[   34.098834]  [<ffffffff814ec808>] __schedule+0x5a8/0x690
[   34.098835]  [<ffffffff814ec927>] schedule+0x37/0x80
[   34.098836]  [<ffffffff814ef3fd>] schedule_hrtimeout_range_clock+0xad/0x130
[   34.098837]  [<ffffffff81090be0>] ? hrtimer_init+0x10/0x10
[   34.098838]  [<ffffffff814ef3f1>] ? schedule_hrtimeout_range_clock+0xa1/0x130
[   34.098839]  [<ffffffff814ef48e>] schedule_hrtimeout_range+0xe/0x10
[   34.098840]  [<ffffffff814eef9b>] usleep_range+0x3b/0x40
[   34.098853]  [<ffffffffa01ec109>] i915_guc_wq_check_space+0x119/0x210 [i915]
[   34.098861]  [<ffffffffa01da47c>] intel_logical_ring_alloc_request_extras+0x5c/0x70 [i915]
[   34.098869]  [<ffffffffa01cdbf1>] i915_gem_request_alloc+0x91/0x170 [i915]
[   34.098875]  [<ffffffffa01c1c07>] i915_gem_do_execbuffer.isra.25+0xbc7/0x12a0 [i915]
[   34.098882]  [<ffffffffa01cb785>] ? i915_gem_object_get_pages_gtt+0x225/0x3c0 [i915]
[   34.098889]  [<ffffffffa01d1fb6>] ? i915_gem_pwrite_ioctl+0xd6/0x9f0 [i915]
[   34.098895]  [<ffffffffa01c2e68>] i915_gem_execbuffer2+0xa8/0x250 [i915]
[   34.098900]  [<ffffffffa00f65d8>] drm_ioctl+0x258/0x4f0 [drm]
[   34.098906]  [<ffffffffa01c2dc0>] ? i915_gem_execbuffer+0x340/0x340 [i915]
[   34.098908]  [<ffffffff8111590d>] do_vfs_ioctl+0x2cd/0x4a0
[   34.098909]  [<ffffffff8111eac2>] ? __fget+0x72/0xb0
[   34.098910]  [<ffffffff81115b1c>] SyS_ioctl+0x3c/0x70
[   34.098911]  [<ffffffff814effd7>] entry_SYSCALL_64_fastpath+0x12/0x6a
[   34.100208] ------------[ cut here ]------------

I don't know the code well enough to suggest a suitable fix.

Code really, really, really should be run with full debug tooling (and the
above is one of the most basic ones) enabled.

I think the best approach would be to:
- install GuC firmware on all CI machines. Please coordinate with
   Daniela&Tomi on this, and cc: intel-gfx-ci@xxxxxxxxxxxxxxxxx on any such
   mails.

- Every patch series that touches still disabled code needs to include a
   patch to enable GuC (or whatever it is) by default. Of course that patch
   must have a DONTMERGE or similar tag, if the feature isn't 100% ready
   for prime time yet. This way we can enlist CI to hunt for stuff like the
   above, and it's _really_ good at that.

Alex&Tvrtko, can you please take care of this?

Chris Harris has liaised with Tomi and apparently GuC firmware is 
already on the lab SKLs from ~two weeks ago.

Since the firmware is on all machines, presumably we would also want to 
have CI runs without the GuC enabled to ensure patches are not breaking 
the execlists mode?

How would you go about that? Submit patch series twice? Once with the 
final "enable GuC" patch and one without?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx