Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 22/03/18 09:32, Jeff King wrote:
> On Wed, Mar 21, 2018 at 04:59:19PM +0100, Duy Nguyen wrote:
> 
>>> I hate to be a wet blanket, but am I the only one who is wondering
>>> whether the tradeoffs is worth it? 8% memory reduction doesn't seem
>>> mind-bogglingly good,
>>
>> AEvar measured RSS. If we count objects[] array alone, the saving is
>> 40% (136 bytes per entry down to 80). Some is probably eaten up by
>> mmap in rss.
> 
> Measuring actual heap usage with massif, I get before/after peak heaps
> of 1728 and 1346MB respectively when repacking linux.git. So that's ~22%
> savings overall.
> 
> Of the used heap after your patches:
> 
>  - ~40% of that is from packlist_alloc()
>  - ~17% goes to "struct object"
>  - ~10% for the object.c hash table to store all the "struct object"
>  - ~7% goes to the delta cache
>  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
>        there, too; I think our peak is when we're sorting the revindex
>        and have to keep two copies in memory at once)

which begs the question, how much slower would it be if we
replaced the radix-sort with an in-place sort (e.g. heapsort).

I hacked up the patch below, just for fun. I don't have any
large repos (or enough disk space) to do any meaningful perf
tests, but I did at least compile it and it passes the test-suite.
(That is no guarantee that I haven't introduced bugs, of course!)

;-)

ATB,
Ramsay Jones

-- >8 --
Subject: [PATCH] pack-revindex: replace radix-sort with in-place heapsort

Signed-off-by: Ramsay Jones <ramsay@xxxxxxxxxxxxxxxxxxxx>
---
 pack-revindex.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/pack-revindex.c b/pack-revindex.c
index ff5f62c03..16f17eac1 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -15,6 +15,7 @@
  * get the object sha1 from the main index.
  */
 
+#ifdef DUMMY
 /*
  * This is a least-significant-digit radix sort.
  *
@@ -112,6 +113,65 @@ static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
 #undef BUCKETS
 #undef DIGIT_SIZE
 }
+#endif
+
+static inline void swap(struct revindex_entry *a, int i, int j)
+{
+	struct revindex_entry t;
+
+	t = a[i];
+	a[i] = a[j];
+	a[j] = t;
+}
+
+/*
+ * assume that elements first .. last (array index first-1 .. last-1) obey
+ * the partially ordered tree property, except possibly for the children of
+ * the first element. push down the first element until the partially
+ * ordered tree property is restored.
+ */
+static void push_down(struct revindex_entry *a, int first, int last)
+{
+	int parent = first;
+	int last_node = last / 2;
+
+	while (parent <= last_node) {
+
+		int left = 2 * parent;
+		int right = left + 1;
+		int biggest;
+
+		if (right > last) /* parent only has one child */
+			biggest = left;
+		else {
+			if (a[left-1].offset >= a[right-1].offset)
+				biggest = left;
+			else
+				biggest = right;
+
+		}
+
+		if (a[parent-1].offset >= a[biggest-1].offset)
+			break; /* partially ordered tree property, we're done */
+
+		/* push parent down */
+		swap(a, parent-1, biggest-1);
+		parent = biggest;
+	}
+}
+
+static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max)
+{
+	int i;
+
+	for (i = n/2; i > 0; i--)
+		push_down(entries, i, n);
+
+	for (i = n; i > 1; i--) {
+		swap(entries, 0, i-1);
+		push_down(entries, 1, i-1);
+	}
+}
 
 /*
  * Ordered list of offsets of objects in the pack.
-- 
2.16.0




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux