Re: [PATCH] drm/i915: Trim the object sg table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 09/11/2016 14:44, Chris Wilson wrote:
On Wed, Nov 09, 2016 at 02:30:02PM +0000, Tvrtko Ursulin wrote:
From: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>

At the moment we allocate enough sg table entries assuming we
will not be able to do any coallescing. But since in practice
we most often can, and more so very effectively, this ends up
wasting a lot of memory.

A simple and effective way of trimming the over-allocated
entries is to copy the table over to a new one allocated to the
exact size.

Experiment on my freshly logged and idle desktop (KDE) showed
Experiments
that by doing this we can save approximately 1 MiB of RAM, or
when running a typical benchmark like gl_manhattan I have
even seen a 6 MiB saving.

More complicated techniques such as only copying the last used page and
freeing the rest are left to the reader.

Yes that would need to go into the core kernel since it needs access to static alloc/free functions and performance benefit might be quite small for that. Typically I see coalescing working really well so the delta in saved allocations and frees would be quite small. Perhaps I need to attempt to fragment my memory a lot to see what happens then.

v2:
 * Update commit message.
 * Use temporary sg_table on stack. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
---
 drivers/gpu/drm/i915/i915_gem.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d2ad73d0b5b9..411aae535abe 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2232,6 +2232,28 @@ static unsigned int swiotlb_max_size(void)
 #endif
 }

+static void i915_sg_trim(struct sg_table *orig_st)
+{
+	struct sg_table new_st;
+	struct scatterlist *sg, *new_sg;
+	unsigned int i;
+
+	if (orig_st->nents == orig_st->orig_nents)
+		return;
+
+	if (sg_alloc_table(&new_st, orig_st->nents, GFP_KERNEL))
+		return;
+
+	new_sg = new_st.sgl;
+	for_each_sg(orig_st->sgl, sg, orig_st->nents, i) {
+		sg_set_page(new_sg, sg_page(sg), sg->length, 0);
+		new_sg = sg_next(new_sg);

Worth a
	/* called before being DMA mapped, no need to copy sg->dma_* */
?

Hm, or something safer than a comment. Unfortunately entries are not zeroed by default to enable a GEM_BUG_ON here. Unless CONFIG_GEM_DEBUG could mean GFP_ZERO added to some our allocations. :)

Yeah I think comment is the best option as long as this function is static only. Will add.


+	}
+
+	sg_free_table(orig_st);
+	memcpy(orig_st, &new_st, sizeof(*orig_st));

I would have used *orig_st = new;

It is more readable, agreed.


Reviewed-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
           ^ I remembered it this time!
Took a couple of attempts to spell my name right though.

Thanks! I assume I can keep it for the above little changes.

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux