Re: [PATCH 1/6] dma-buf: add dynamic DMA-buf handling v13

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 31.07.19 um 11:12 schrieb Daniel Vetter:
[SNIP]
I think I brought this up before, but new top-post for a clean start.

Use-case I have in mind is something like amdkfd's model, where you have a
list of buffers (per context or whatever) that you always need to have
present. Idea is to also use this for traditional CS for vk/gl, to cut
down on the buffer management overhead, but we'd still allow additional
buffers to be listed per-CS on top of that default working set.

This of course means no implicit sync anymore on these default buffers
(the point is to avoid touching every buffer on every CS, updating fences
would defeat that). That's why the CS can still list additional buffers,
the only reason for that is to add implicit sync fences. Those buffers
would be most likely in the default working set already.

Consequence is that I want the amdkfd model of "evict when needed, but
keep resident by default", but also working implicit fences. And it must
be doable without touching every bo on every CS. Listing possible
implementation options:

- the amdkfd trick doesn't work because it would break implicit fencing -
   any implicit sync would always result in the context getting
   preempted/evicted, which isn't great.

I'm actually working on re-working implicit fencing towards better supporting this.

Basic idea is that you split up the fences in a reservation object into three categories:
1. Implicit sync on write.
2. Implicit sync on read.
3. No implicit sync at all.

This should not only help the KFD, but also with amdgpu command submission and things like page tables updates.

E.g. we need to fences for page tables updates around in reservation objects as well, but you really really really don't want any implicit synchronization with them :)

I think that having a consensus of the meaning of the fences in a reservation object will be rather fundamental for what we are planning to do here.

- we share the resv_obj between all the buffers in the default working set
   of a context, with unsharing/resharing the resv_obj if they enter/leave
   the default working set. That way there's only one resv_obj to update on
   each CS, and we can attach a new shared fence for every CS. Trouble is
   that this means a given buffer can only be part of one default working
   set, so all shared buffers would need to be listed again separately. Not
   so great if userspace has to deal with that fairly arbitrary limitation.

Yeah, that is exactly what we do with the per VM BOs in amdgpu.

The limitation that you have only one working set actually turned out to be not a limitation at all, but rather seen as something welcomed by our Vulkan guys.

I also don't really see a way to have an implementation with good performance where BOs can be in multiple working sets at the same time.

- we allow the ->move_notify callback to add new fences, which the
   exporter needs to wait on before it schedules the pipelined move. This
   also avoids the per-BO update on every CS, and it would allow buffers to
   be shared and to be in multiple default working sets. The downside is
   that ->move_notify needs to be able to cope with added fences, which
   means we might need to grow the shared fences array, which might fail
   with ENOMEM. Not great. We could fix this with some kind of permanent
   shared fence slot reservations (i.e. a reserved slot which outlives
   holding the reservation lock), but that might waste quite a bit of
   memory. Probably not real problem in the grand scheme of things. I think
   the fence itself can be preallocated per context, so that isn't the
   problem.

Well the ENOMEM problem is the least of the problems with this approach. You can still block for the fence which you wanted to add.

The real problem is that you can't tell if a BO is busy or not by just looking at its current fences.

In other words when you stop adding fences you also want to stop moving them individually on the LRU.

When the memory management evicts one BO you essentially kick out a whole process/working set.

So when you want to kick out the next BO you actually want to do this for BOs which now became available anyway.

That approach won't work with the move_notify callback.

- same as above, but the new fence doesn't get added, but returned to the
   caller, and the exporter deals with the ENOMEM mess. Might not work
   since an importer could have a lot of contexts using a given object, and
   so would have a lot of fences to add.

I don't think that this will work.

See you not only need to be able to add the fence to the BO currently evicted, but also to all other BO in your process/working set.

Additional to that moving the ENOMEM handling from the importer to the exporter sounds as helpful as adding another layer of abstraction :)

Regards,
Christian.


- something entirely different?

Thoughts?

Cheers, Daniel

---
  drivers/dma-buf/dma-buf.c | 183 ++++++++++++++++++++++++++++++++++++--
  include/linux/dma-buf.h   | 108 ++++++++++++++++++++--
  2 files changed, 277 insertions(+), 14 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 6c15deb5d4ad..bd8611fa2cfa 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -531,6 +531,9 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
  		return ERR_PTR(-EINVAL);
  	}
+ if (WARN_ON(exp_info->ops->cache_sgt_mapping && exp_info->ops->pin))
+		return ERR_PTR(-EINVAL);
+
  	if (!try_module_get(exp_info->owner))
  		return ERR_PTR(-ENOENT);
@@ -651,10 +654,12 @@ void dma_buf_put(struct dma_buf *dmabuf)
  EXPORT_SYMBOL_GPL(dma_buf_put);
/**
- * dma_buf_attach - Add the device to dma_buf's attachments list; optionally,
+ * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list; optionally,
   * calls attach() of dma_buf_ops to allow device-specific attach functionality
- * @dmabuf:	[in]	buffer to attach device to.
- * @dev:	[in]	device to be attached.
+ * @dmabuf:		[in]	buffer to attach device to.
+ * @dev:		[in]	device to be attached.
+ * @importer_ops	[in]	importer operations for the attachment
+ * @importer_priv	[in]	importer private pointer for the attachment
   *
   * Returns struct dma_buf_attachment pointer for this attachment. Attachments
   * must be cleaned up by calling dma_buf_detach().
@@ -668,8 +673,10 @@ EXPORT_SYMBOL_GPL(dma_buf_put);
   * accessible to @dev, and cannot be moved to a more suitable place. This is
   * indicated with the error code -EBUSY.
   */
-struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-					  struct device *dev)
+struct dma_buf_attachment *
+dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
+		       const struct dma_buf_attach_ops *importer_ops,
+		       void *importer_priv)
  {
  	struct dma_buf_attachment *attach;
  	int ret;
@@ -683,6 +690,8 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
attach->dev = dev;
  	attach->dmabuf = dmabuf;
+	attach->importer_ops = importer_ops;
+	attach->importer_priv = importer_priv;
mutex_lock(&dmabuf->lock); @@ -691,16 +700,72 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
  		if (ret)
  			goto err_attach;
  	}
+	reservation_object_lock(dmabuf->resv, NULL);
  	list_add(&attach->node, &dmabuf->attachments);
+	reservation_object_unlock(dmabuf->resv);
mutex_unlock(&dmabuf->lock); + /* When either the importer or the exporter can't handle dynamic
+	 * mappings we cache the mapping here to avoid issues with the
+	 * reservation object lock.
+	 */
+	if (dma_buf_attachment_is_dynamic(attach) !=
+	    dma_buf_is_dynamic(dmabuf)) {
+		struct sg_table *sgt;
+
+		if (dma_buf_is_dynamic(attach->dmabuf)) {
+			reservation_object_lock(attach->dmabuf->resv, NULL);
+			ret = dma_buf_pin(attach);
+			if (ret)
+				goto err_unlock;
+		}
+
+		sgt = dmabuf->ops->map_dma_buf(attach, DMA_BIDIRECTIONAL);
+		if (!sgt)
+			sgt = ERR_PTR(-ENOMEM);
+		if (IS_ERR(sgt)) {
+			ret = PTR_ERR(sgt);
+			goto err_unpin;
+		}
+		if (dma_buf_is_dynamic(attach->dmabuf))
+			reservation_object_unlock(attach->dmabuf->resv);
+		attach->sgt = sgt;
+		attach->dir = DMA_BIDIRECTIONAL;
+	}
+
  	return attach;
err_attach:
  	kfree(attach);
  	mutex_unlock(&dmabuf->lock);
  	return ERR_PTR(ret);
+
+err_unpin:
+	if (dma_buf_is_dynamic(attach->dmabuf))
+		dma_buf_unpin(attach);
+
+err_unlock:
+	if (dma_buf_is_dynamic(attach->dmabuf))
+		reservation_object_unlock(attach->dmabuf->resv);
+
+	dma_buf_detach(dmabuf, attach);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(dma_buf_dynamic_attach);
+
+/**
+ * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
+ * @dmabuf:	[in]	buffer to attach device to.
+ * @dev:	[in]	device to be attached.
+ *
+ * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static
+ * mapping.
+ */
+struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
+					  struct device *dev)
+{
+	return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL);
  }
  EXPORT_SYMBOL_GPL(dma_buf_attach);
@@ -717,11 +782,22 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
  	if (WARN_ON(!dmabuf || !attach))
  		return;
- if (attach->sgt)
+	if (attach->sgt) {
+		if (dma_buf_is_dynamic(attach->dmabuf))
+			reservation_object_lock(attach->dmabuf->resv, NULL);
+
  		dmabuf->ops->unmap_dma_buf(attach, attach->sgt, attach->dir);
+ if (dma_buf_is_dynamic(attach->dmabuf)) {
+			dma_buf_unpin(attach);
+			reservation_object_unlock(attach->dmabuf->resv);
+		}
+	}
+
  	mutex_lock(&dmabuf->lock);
+	reservation_object_lock(dmabuf->resv, NULL);
  	list_del(&attach->node);
+	reservation_object_unlock(dmabuf->resv);
  	if (dmabuf->ops->detach)
  		dmabuf->ops->detach(dmabuf, attach);
@@ -730,6 +806,44 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
  }
  EXPORT_SYMBOL_GPL(dma_buf_detach);
+/**
+ * dma_buf_pin - Lock down the DMA-buf
+ *
+ * @attach:	[in]	attachment which should be pinned
+ *
+ * Returns:
+ * 0 on success, negative error code on failure.
+ */
+int dma_buf_pin(struct dma_buf_attachment *attach)
+{
+	struct dma_buf *dmabuf = attach->dmabuf;
+	int ret = 0;
+
+	reservation_object_assert_held(dmabuf->resv);
+
+	if (dmabuf->ops->pin)
+		ret = dmabuf->ops->pin(attach);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(dma_buf_pin);
+
+/**
+ * dma_buf_unpin - Remove lock from DMA-buf
+ *
+ * @attach:	[in]	attachment which should be unpinned
+ */
+void dma_buf_unpin(struct dma_buf_attachment *attach)
+{
+	struct dma_buf *dmabuf = attach->dmabuf;
+
+	reservation_object_assert_held(dmabuf->resv);
+
+	if (dmabuf->ops->unpin)
+		dmabuf->ops->unpin(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_unpin);
+
  /**
   * dma_buf_map_attachment - Returns the scatterlist table of the attachment;
   * mapped into _device_ address space. Is a wrapper for map_dma_buf() of the
@@ -749,6 +863,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
  					enum dma_data_direction direction)
  {
  	struct sg_table *sg_table;
+	int r;
might_sleep(); @@ -767,10 +882,31 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
  		return attach->sgt;
  	}
+ if (dma_buf_attachment_is_dynamic(attach)) {
+		reservation_object_assert_held(attach->dmabuf->resv);
+
+	} else if (dma_buf_is_dynamic(attach->dmabuf)) {
+		reservation_object_lock(attach->dmabuf->resv, NULL);
+		r = dma_buf_pin(attach);
+		if (r) {
+			reservation_object_unlock(attach->dmabuf->resv);
+			return ERR_PTR(r);
+		}
+	} else {
+		might_lock(&attach->dmabuf->resv->lock.base);
+	}
+
  	sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
  	if (!sg_table)
  		sg_table = ERR_PTR(-ENOMEM);
+ if (!dma_buf_attachment_is_dynamic(attach) &&
+	    dma_buf_is_dynamic(attach->dmabuf)) {
+		if (IS_ERR(sg_table))
+			dma_buf_unpin(attach);
+		reservation_object_unlock(attach->dmabuf->resv);
+	}
+
  	if (!IS_ERR(sg_table) && attach->dmabuf->ops->cache_sgt_mapping) {
  		attach->sgt = sg_table;
  		attach->dir = direction;
@@ -802,10 +938,43 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
  	if (attach->sgt == sg_table)
  		return;
+ if (dma_buf_attachment_is_dynamic(attach))
+		reservation_object_assert_held(attach->dmabuf->resv);
+	else if (dma_buf_is_dynamic(attach->dmabuf))
+		reservation_object_lock(attach->dmabuf->resv, NULL);
+	else
+		might_lock(&attach->dmabuf->resv->lock.base);
+
  	attach->dmabuf->ops->unmap_dma_buf(attach, sg_table, direction);
+
+	if (dma_buf_is_dynamic(attach->dmabuf) &&
+	    !dma_buf_attachment_is_dynamic(attach)) {
+		dma_buf_unpin(attach);
+		reservation_object_unlock(attach->dmabuf->resv);
+	}
  }
  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
+/**
+ * dma_buf_move_notify - notify attachments that DMA-buf is moving
+ *
+ * @dmabuf:	[in]	buffer which is moving
+ *
+ * Informs all attachmenst that they need to destroy and recreated all their
+ * mappings.
+ */
+void dma_buf_move_notify(struct dma_buf *dmabuf)
+{
+	struct dma_buf_attachment *attach;
+
+	reservation_object_assert_held(dmabuf->resv);
+
+	list_for_each_entry(attach, &dmabuf->attachments, node)
+		if (attach->importer_ops && attach->importer_ops->move_notify)
+			attach->importer_ops->move_notify(attach);
+}
+EXPORT_SYMBOL_GPL(dma_buf_move_notify);
+
  /**
   * DOC: cpu access
   *
@@ -1225,10 +1394,12 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused)
  		seq_puts(s, "\tAttached Devices:\n");
  		attach_count = 0;
+ reservation_object_lock(buf_obj->resv, NULL);
  		list_for_each_entry(attach_obj, &buf_obj->attachments, node) {
  			seq_printf(s, "\t%s\n", dev_name(attach_obj->dev));
  			attach_count++;
  		}
+		reservation_object_unlock(buf_obj->resv);
seq_printf(s, "Total %d devices attached\n\n",
  				attach_count);
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 01ad5b942a6f..ccad2fc1f640 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -92,14 +92,40 @@ struct dma_buf_ops {
  	 */
  	void (*detach)(struct dma_buf *, struct dma_buf_attachment *);
+ /**
+	 * @pin:
+	 *
+	 * This is called by dma_buf_pin and lets the exporter know that the
+	 * DMA-buf can't be moved any more.
+	 *
+	 * This is called with the dmabuf->resv object locked.
+	 *
+	 * This callback is optional.
+	 *
+	 * Returns:
+	 *
+	 * 0 on success, negative error code on failure.
+	 */
+	int (*pin)(struct dma_buf_attachment *attach);
+
+	/**
+	 * @unpin:
+	 *
+	 * This is called by dma_buf_unpin and lets the exporter know that the
+	 * DMA-buf can be moved again.
+	 *
+	 * This is called with the dmabuf->resv object locked.
+	 *
+	 * This callback is optional.
+	 */
+	void (*unpin)(struct dma_buf_attachment *attach);
+
  	/**
  	 * @map_dma_buf:
  	 *
  	 * This is called by dma_buf_map_attachment() and is used to map a
  	 * shared &dma_buf into device address space, and it is mandatory. It
-	 * can only be called if @attach has been called successfully. This
-	 * essentially pins the DMA buffer into place, and it cannot be moved
-	 * any more
+	 * can only be called if @attach has been called successfully.
  	 *
  	 * This call may sleep, e.g. when the backing storage first needs to be
  	 * allocated, or moved to a location suitable for all currently attached
@@ -120,6 +146,9 @@ struct dma_buf_ops {
  	 * any other kind of sharing that the exporter might wish to make
  	 * available to buffer-users.
  	 *
+	 * This is always called with the dmabuf->resv object locked when
+	 * the pin/unpin callbacks are implemented.
+	 *
  	 * Returns:
  	 *
  	 * A &sg_table scatter list of or the backing storage of the DMA buffer,
@@ -137,9 +166,6 @@ struct dma_buf_ops {
  	 *
  	 * This is called by dma_buf_unmap_attachment() and should unmap and
  	 * release the &sg_table allocated in @map_dma_buf, and it is mandatory.
-	 * It should also unpin the backing storage if this is the last mapping
-	 * of the DMA buffer, it the exporter supports backing storage
-	 * migration.
  	 */
  	void (*unmap_dma_buf)(struct dma_buf_attachment *,
  			      struct sg_table *,
@@ -330,6 +356,34 @@ struct dma_buf {
  	} cb_excl, cb_shared;
  };
+/**
+ * struct dma_buf_attach_ops - importer operations for an attachment
+ * @move_notify: [optional] notification that the DMA-buf is moving
+ *
+ * Attachment operations implemented by the importer.
+ */
+struct dma_buf_attach_ops {
+	/**
+	 * @move_notify
+	 *
+	 * If this callback is provided the framework can avoid pinning the
+	 * backing store while mappings exists.
+	 *
+	 * This callback is called with the lock of the reservation object
+	 * associated with the dma_buf held and the mapping function must be
+	 * called with this lock held as well. This makes sure that no mapping
+	 * is created concurrently with an ongoing move operation.
+	 *
+	 * Mappings stay valid and are not directly affected by this callback.
+	 * But the DMA-buf can now be in a different physical location, so all
+	 * mappings should be destroyed and re-created as soon as possible.
+	 *
+	 * New mappings can be created after this callback returns, and will
+	 * point to the new location of the DMA-buf.
+	 */
+	void (*move_notify)(struct dma_buf_attachment *attach);
+};
+
  /**
   * struct dma_buf_attachment - holds device-buffer attachment data
   * @dmabuf: buffer for this attachment.
@@ -338,6 +392,8 @@ struct dma_buf {
   * @sgt: cached mapping.
   * @dir: direction of cached mapping.
   * @priv: exporter specific attachment data.
+ * @importer_ops: importer operations for this attachment.
+ * @importer_priv: importer specific attachment data.
   *
   * This structure holds the attachment information between the dma_buf buffer
   * and its user device(s). The list contains one attachment struct per device
@@ -355,6 +411,9 @@ struct dma_buf_attachment {
  	struct sg_table *sgt;
  	enum dma_data_direction dir;
  	void *priv;
+
+	const struct dma_buf_attach_ops *importer_ops;
+	void *importer_priv;
  };
/**
@@ -405,10 +464,42 @@ static inline void get_dma_buf(struct dma_buf *dmabuf)
  	get_file(dmabuf->file);
  }
+/**
+ * dma_buf_is_dynamic - check if a DMA-buf uses dynamic mappings.
+ * @dmabuf: the DMA-buf to check
+ *
+ * Returns true if a DMA-buf exporter wants to create dynamic sg table mappings
+ * for each attachment. False if only a single static sg table should be used.
+ */
+static inline bool dma_buf_is_dynamic(struct dma_buf *dmabuf)
+{
+	return !!dmabuf->ops->pin;
+}
+
+/**
+ * dma_buf_attachment_is_dynamic - check if a DMA-buf attachment uses dynamic
+ * mappinsg
+ * @attach: the DMA-buf attachment to check
+ *
+ * Returns true if a DMA-buf importer wants to use dynamic sg table mappings and
+ * calls the map/unmap functions with the reservation object locked.
+ */
+static inline bool
+dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach)
+{
+	return attach->importer_ops && attach->importer_ops->move_notify;
+}
+
  struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
-							struct device *dev);
+					  struct device *dev);
+struct dma_buf_attachment *
+dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
+		       const struct dma_buf_attach_ops *importer_ops,
+		       void *importer_priv);
  void dma_buf_detach(struct dma_buf *dmabuf,
-				struct dma_buf_attachment *dmabuf_attach);
+		    struct dma_buf_attachment *attach);
+int dma_buf_pin(struct dma_buf_attachment *attach);
+void dma_buf_unpin(struct dma_buf_attachment *attach);
struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info); @@ -420,6 +511,7 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *,
  					enum dma_data_direction);
  void dma_buf_unmap_attachment(struct dma_buf_attachment *, struct sg_table *,
  				enum dma_data_direction);
+void dma_buf_move_notify(struct dma_buf *dma_buf);
  int dma_buf_begin_cpu_access(struct dma_buf *dma_buf,
  			     enum dma_data_direction dir);
  int dma_buf_end_cpu_access(struct dma_buf *dma_buf,
--
2.17.1

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux