On Thu, Aug 30, 2012 at 09:23:25AM -0700, Junio C Hamano wrote: > Jeff King <peff@xxxxxxxx> writes: > > > Anyway, since this isn't yielding any performance benefit, I'm not going > > to go down that route. But stability of the queue is something that we > > need to consider if we ever do replace commit_list with a different data > > structure. > > > > Here's the patch to make the existing priority queue stable (by wasting > > space) in case we ever want to use it. > > Thanks for being thorough and showing a good analysis. > > If we want stability, the space to hold insertion counter is not > "wasted", by the way. It is if there is another data structure that will do the same job with the same performance characteristics and without using that space. But I'm not even sure it is an issue in practice. There are ~320K objects in linux-2.6. Even if we inserted _all_ of them at once into a queue (which we don't; it's usually more like a couple dozen, with a few pathological cases being equal to the number of refs we have, which is usually in the hundreds), then we are talking about an extra 2.5M on a 64-bit system. Compared to dropping an O(n^2) operation to O(lg n), that's probably not even going to be noticeable. > I actually think the generic "queue" implementation may want to > handle elements that are not just single pointers. Just like > qsort() is not about sorting an array of pointers but it is about > sorting an array of elements of caller specified size, perhaps > "queue" can hold elements of size the caller specified (set in stone > when the queue is initialized). > > Then, a caller that wants a stable priority queue of commits can > tell the queue to manage "struct { struct commit *c; int gen; }", > use the "gen" field for stability in its comparison callback, etc., > while a caller that does not need stability can tell the queue to > manage "struct commit *". That way, the generic queue layer does > not have to worry about wasting the insertion counter space, no? Yeah, that would be more generic. One wonky thing I didn't point out in my implementation is this: > +static inline int queue_cmp(struct queue *pq, unsigned i, unsigned j) > +{ > + int cmp = pq->cmp(pq->items[i].item, pq->items[j].item); > + if (cmp) > + return cmp; > + if (pq->items[i].counter < pq->items[j].counter) > + return 1; > + return -1; > +} Notice how the counter comparison is "backwards" to the regular comparison. That's because the queue is actually a max-queue (I did it that way since we are indexing on timestamp, and want to go in reverse chronological order). But the counter part wants to prioritize minimums, so you have to reverse it. You could also implement a min-queue and ask the caller to reverse their comparison function. But really, a caller could theoretically want to prioritize in either direction (e.g., giving a LIFO behavior to elements with the same priority). Pulling the counter out of the queue would allow that. The reason I didn't, though, is that it would make the interface a huge pain if the caller had to do this: struct stable_commit_queue_element qe; qe.commit = commit; qe.counter = counter++; /* who is holding the global counter? */ queue_push(&q, &qe); instead of: queue_push(&q, commit); I suspect you could build a "queue" object, and then implement a "stable queue" on top of it with some clever use of function pointers for the comparison function. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html