Re: [PATCH] drm/i915: Do not leak VMAs (and PPGTT VMs) of imported flinked objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 04/20/2015 01:36 PM, Chris Wilson wrote:
On Mon, Apr 20, 2015 at 01:14:34PM +0100, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

If a client instantiates a VMA against an imported object (flink) this VMA
will not get unbound when the object is closed.

This happens because the exporter holds a reference count on the object and
will also keep a reference to the PPGTT VM.

In real life this happens with xorg-driver-intel and fbcon takeover. Latter
is copied from via the flink name and when Xorg process exists one VMA
remains dangling with a now unreachable VM.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Testcase: igt/gem_ppgtt/flink-vs-ctx-vm-leak
---
  drivers/gpu/drm/i915/i915_drv.c |  1 +
  drivers/gpu/drm/i915/i915_drv.h |  2 ++
  drivers/gpu/drm/i915/i915_gem.c | 63 ++++++++++++++++++++++++++++++++---------
  3 files changed, 53 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f9754c3..16a0b34 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1630,6 +1630,7 @@ static struct drm_driver driver = {
  	.debugfs_init = i915_debugfs_init,
  	.debugfs_cleanup = i915_debugfs_cleanup,
  #endif
+	.gem_close_object = i915_gem_close_object,
  	.gem_free_object = i915_gem_free_object,
  	.gem_vm_ops = &i915_gem_vm_ops,

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 6a2528c..e82790b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2635,6 +2635,8 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
  						  size_t size);
  void i915_init_vm(struct drm_i915_private *dev_priv,
  		  struct i915_address_space *vm);
+void i915_gem_close_object(struct drm_gem_object *gem_obj,
+			   struct drm_file *file);
  void i915_gem_free_object(struct drm_gem_object *obj);
  void i915_gem_vma_destroy(struct i915_vma *vma);

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f7b8766..a720154 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4523,6 +4523,55 @@ static bool discard_backing_storage(struct drm_i915_gem_object *obj)
  	return atomic_long_read(&obj->base.filp->f_count) == 1;
  }

+static void i915_gem_unbind_vma(struct drm_i915_private *dev_priv,
+				struct i915_vma *vma)
+{
+	if (WARN_ON(i915_vma_unbind(vma) == -ERESTARTSYS)) {
+		bool was_interruptible;
+
+		was_interruptible = dev_priv->mm.interruptible;
+		dev_priv->mm.interruptible = false;
+
+		WARN_ON(i915_vma_unbind(vma));
+
+		dev_priv->mm.interruptible = was_interruptible;
+	}
+}
+
+void i915_gem_close_object(struct drm_gem_object *gem_obj,
+			   struct drm_file *file)
+{
+	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct i915_vma *vma, *next;
+	struct i915_hw_ppgtt *ppgtt;
+
+	mutex_lock(&dev->struct_mutex);
+
+	/*
+	 * Release all VMAs associated with this client's PPGTT.
+	 *
+	 * This is to avoid potentially unreachable VMAs since contexts can have
+	 * shorter lifetime than objects. Meaning if a client has a reference to
+	 * an object (flink) and an instantiated VMA, when it exists neither VMA
+	 * will be unbound (since object free won't run), nor the PPGTT VM
+	 * freed (since VMA holds a reference to it).
+	 */
+	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+		if (i915_is_ggtt(vma->vm))
+			continue;
+
+		ppgtt = (struct i915_hw_ppgtt *)vma->vm;
+		if (ppgtt->file_priv != file_priv)
+			continue;
+
+		i915_gem_unbind_vma(dev->dev_private, vma);

No we can't do this, as it makes close sync and so can have disasterous
effects on performance (though mitigated chiefly by userspace
agressively caching bo) and also the unbind is very likely to fail,
though admittedly the fbcon copy should be before X starts (ab)using
signals - hence nasty WARN_ON.

Plus also walking this linear list is quite painful in certain abusive
tests. My preference for fixing this bug would be via vma active
references and auto-unbinding on retirement after a close.

Why this is any different today with GEM_CLOSE having pretty similar VMA unbind loop? Is the typical (and interesting) case not that GEM_CLOSE will trigger gem_object_close and gem_object_free at the same invocation?

Expect some fun patches in your inbox Tvrtko :-p

Ok, just don't make them depend on everything. :)

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx





[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux