Re: [PATCH 2/2] commit: use a priority queue in merge base functions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 30, 2012 at 09:23:25AM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
> 
> > Anyway, since this isn't yielding any performance benefit, I'm not going
> > to go down that route. But stability of the queue is something that we
> > need to consider if we ever do replace commit_list with a different data
> > structure.
> >
> > Here's the patch to make the existing priority queue stable (by wasting
> > space) in case we ever want to use it.
> 
> Thanks for being thorough and showing a good analysis.
> 
> If we want stability, the space to hold insertion counter is not
> "wasted", by the way.

It is if there is another data structure that will do the same job with
the same performance characteristics and without using that space.

But I'm not even sure it is an issue in practice. There are ~320K
objects in linux-2.6. Even if we inserted _all_ of them at once into a
queue (which we don't; it's usually more like a couple dozen, with a few
pathological cases being equal to the number of refs we have, which is
usually in the hundreds), then we are talking about an extra 2.5M on a
64-bit system. Compared to dropping an O(n^2) operation to O(lg n),
that's probably not even going to be noticeable.

> I actually think the generic "queue" implementation may want to
> handle elements that are not just single pointers.  Just like
> qsort() is not about sorting an array of pointers but it is about
> sorting an array of elements of caller specified size, perhaps
> "queue" can hold elements of size the caller specified (set in stone
> when the queue is initialized).
> 
> Then, a caller that wants a stable priority queue of commits can
> tell the queue to manage "struct { struct commit *c; int gen; }",
> use the "gen" field for stability in its comparison callback, etc.,
> while a caller that does not need stability can tell the queue to
> manage "struct commit *".  That way, the generic queue layer does
> not have to worry about wasting the insertion counter space, no?

Yeah, that would be more generic. One wonky thing I didn't point out in
my implementation is this:

> +static inline int queue_cmp(struct queue *pq, unsigned i, unsigned j)
> +{
> +       int cmp = pq->cmp(pq->items[i].item, pq->items[j].item);
> +       if (cmp)
> +               return cmp;
> +       if (pq->items[i].counter < pq->items[j].counter)
> +               return 1;
> +       return -1;
> +}

Notice how the counter comparison is "backwards" to the regular
comparison. That's because the queue is actually a max-queue (I did it
that way since we are indexing on timestamp, and want to go in reverse
chronological order). But the counter part wants to prioritize minimums,
so you have to reverse it. You could also implement a min-queue and ask
the caller to reverse their comparison function.

But really, a caller could theoretically want to prioritize in either
direction (e.g., giving a LIFO behavior to elements with the same
priority). Pulling the counter out of the queue would allow that.

The reason I didn't, though, is that it would make the interface a huge
pain if the caller had to do this:

  struct stable_commit_queue_element qe;
  qe.commit = commit;
  qe.counter = counter++; /* who is holding the global counter? */
  queue_push(&q, &qe);

instead of:

  queue_push(&q, commit);

I suspect you could build a "queue" object, and then implement a
"stable queue" on top of it with some clever use of function pointers
for the comparison function.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]