[PATCH 03/12] reftable/merged: advance subiter on subsequent iteration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



When advancing the merged iterator, we pop the top-most entry from its
priority queue and then advance the sub-iterator that the entry belongs
to, adding the result as a new entry. This is quite sensible in the case
where the merged iterator is used to actual iterate through records. But
the merged iterator is also used when we look up a single record, only,
so advancing the sub-iterator is wasted effort because we would never
even look at the result.

Instead of immediately advancing the sub-iterator, we can also defer
this to the next iteration of the merged iterator by storing the
intent-to-advance. This results in a small speedup when reading many
records. The following benchmark creates 10000 refs, which will also end
up with many ref lookups:

    Benchmark 1: update-ref: create many refs (revision = HEAD~)
      Time (mean ± σ):     337.2 ms ±   7.3 ms    [User: 200.1 ms, System: 136.9 ms]
      Range (min … max):   329.3 ms … 373.2 ms    100 runs

    Benchmark 2: update-ref: create many refs (revision = HEAD)
      Time (mean ± σ):     332.5 ms ±   5.9 ms    [User: 197.2 ms, System: 135.1 ms]
      Range (min … max):   327.6 ms … 359.8 ms    100 runs

    Summary
      update-ref: create many refs (revision = HEAD) ran
        1.01 ± 0.03 times faster than update-ref: create many refs (revision = HEAD~)

While this speedup alone isn't really worth it, this refactoring will
also allow two additional optimizations in subsequent patches. First, it
will allow us to special-case when there is only a single sub-iter left
to circumvent the priority queue altogether. And second, it makes it
easier to avoid copying records to the caller.

Signed-off-by: Patrick Steinhardt <ps@xxxxxx>
---
 reftable/merged.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/reftable/merged.c b/reftable/merged.c
index 12ebd732e8..9b1ccfff00 100644
--- a/reftable/merged.c
+++ b/reftable/merged.c
@@ -19,11 +19,12 @@ license that can be found in the LICENSE file or at
 
 struct merged_iter {
 	struct reftable_iterator *stack;
+	struct merged_iter_pqueue pq;
 	uint32_t hash_id;
 	size_t stack_len;
 	uint8_t typ;
 	int suppress_deletions;
-	struct merged_iter_pqueue pq;
+	ssize_t advance_index;
 };
 
 static int merged_iter_init(struct merged_iter *mi)
@@ -96,13 +97,17 @@ static int merged_iter_next_entry(struct merged_iter *mi,
 	struct pq_entry entry = { 0 };
 	int err = 0;
 
+	if (mi->advance_index >= 0) {
+		err = merged_iter_advance_subiter(mi, mi->advance_index);
+		if (err < 0)
+			return err;
+		mi->advance_index = -1;
+	}
+
 	if (merged_iter_pqueue_is_empty(mi->pq))
 		return 1;
 
 	entry = merged_iter_pqueue_remove(&mi->pq);
-	err = merged_iter_advance_subiter(mi, entry.index);
-	if (err < 0)
-		return err;
 
 	/*
 	  One can also use reftable as datacenter-local storage, where the ref
@@ -116,14 +121,6 @@ static int merged_iter_next_entry(struct merged_iter *mi,
 		struct pq_entry top = merged_iter_pqueue_top(mi->pq);
 		int cmp;
 
-		/*
-		 * When the next entry comes from the same queue as the current
-		 * entry then it must by definition be larger. This avoids a
-		 * comparison in the most common case.
-		 */
-		if (top.index == entry.index)
-			break;
-
 		cmp = reftable_record_cmp(&top.rec, &entry.rec);
 		if (cmp > 0)
 			break;
@@ -137,6 +134,7 @@ static int merged_iter_next_entry(struct merged_iter *mi,
 
 	reftable_record_release(rec);
 	*rec = entry.rec;
+	mi->advance_index = entry.index;
 
 done:
 	if (err)
@@ -160,9 +158,8 @@ static int merged_iter_next(struct merged_iter *mi, struct reftable_record *rec)
 static int merged_iter_next_void(void *p, struct reftable_record *rec)
 {
 	struct merged_iter *mi = p;
-	if (merged_iter_pqueue_is_empty(mi->pq))
+	if (merged_iter_pqueue_is_empty(mi->pq) && mi->advance_index < 0)
 		return 1;
-
 	return merged_iter_next(mi, rec);
 }
 
@@ -255,6 +252,7 @@ static int merged_table_seek_record(struct reftable_merged_table *mt,
 		.typ = reftable_record_type(rec),
 		.hash_id = mt->hash_id,
 		.suppress_deletions = mt->suppress_deletions,
+		.advance_index = -1,
 	};
 	struct merged_iter *p;
 	int err;
-- 
2.43.GIT

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux