On Thu, Oct 27, 2011 at 5:26 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > Dan McGee <dpmcgee@xxxxxxxxx> writes: > >> Two optimizations take place here- we can start our objects array >> iteration from a known point where we left off before we started trying >> to find our tags, > > This I would understand (but I am somewhat curious how much last_untagged > would advance relative to nr_objects for this half of the optimization to > be worth it), but... First off, sorry I wasn't able to respond to you until now, I've been a bit swamped with other projects. I saw this made it into the 1.7.7.3 maint release today, but wanted to at least try to respond to the points you raised (if you didn't investigate them by yourself already). If I remember right, the last_untagged optimization was pretty minor- around 10% advancement of the pointer, never that much more. However, it was a very easy change to make so I figured it was worth the slight additional code (~5 lines changed for that part on its own). > >> and we don't need to do the deep dives required by >> add_family_to_write_order() if the object has already been marked as >> filled. > > I am not sure if this produces the identical result that was benchmarked > in the original series. I was not either when I wrote the patch, and I had hoped to confirm the results you showed in the message of 1b4bb16b9ec. However, I was unable to figure out how you generated those numbers so I wasn't able to do so (and had planned to get back to you to find out how you made those tables). Were you able to verify the ordering did not regress? > For example, if you have a tagged object that is not a commit (say a > blob), you would have written that blob in the second phase (write tagged > objects together), so the family of blobs that share same delta parent as > that blob will not be written in this "Finally all the rest" in the right > place in the original list, no? True, I think. They would not be written in the same place, but is that necessarily the right place? The delta parent of an already-filled tag object would eventually come up in the array as not filled, and then we would do a full deep dive at that point, so the majority of the objects would still be in close proximity. Note that either way, we still have a gap between the original tagged object and its "family"- potentially all the other tagged tips, tags, commits, and trees before the blobs are finally hit and laid out in family groups. So there is at least one potentially big seek that can't be avoided. > I do not think this change would forget to fill an object that needs to be > filled, but it would affect the resulting ordering of the list, so... This was the purpose of the added "if (wo_end != nr_objects) die()" line; it confirms we have traversed and hit every object we originally found. -Dan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html