On Tue, May 01 2018, Derrick Stolee wrote: > On 4/30/2018 6:07 PM, Ævar Arnfjörð Bjarmason wrote: >> Since we show the commit data in the output that's nicely aligned once >> we sort by object type. The decision to show tags before commits is >> pretty arbitrary, but it's much less likely that we'll display a tag, >> so if there is one it makes sense to show it first. > > Here's a non-arbitrary reason: the object types are ordered > topologically (ignoring self-references): > > tag -> commit, tree, blob > commit -> tree > tree -> blob Thanks. I'll add a patch with that comment to v2. >> @@ -421,7 +451,12 @@ static int get_short_oid(const char *name, int len, struct object_id *oid, >> ds.fn = NULL; >> advise(_("The candidates are:")); >> - for_each_abbrev(ds.hex_pfx, show_ambiguous_object, &ds); >> + for_each_abbrev(ds.hex_pfx, collect_ambiguous, &collect); >> + QSORT(collect.oid, collect.nr, sort_ambiguous); > > I was wondering how the old code sorted by SHA even when the ambiguous > objects were loaded from different sources (multiple pack-files, loose > objects). Turns out that for_each_abbrev() does its own sort after > collecting the SHAs and then calls the given function pointer only > once per distinct object. This avoids multiple instances of the same > object, which may appear multiple times across pack-files. > > I only ask because now we are doing two sorts. I wonder if it would be > more elegant to provide your sorting algorithm to for_each_abbrev() > and let it call show_ambiguous_object as before. > > Another question is if we should use this sort generally for all calls > to for_each_abbrev(). The only other case I see is in > builtin/revparse.c. When preparing v2 I realized how confusing this was, so I'd added this to the commit message of my WIP re-roll which should explain this: A note on the implementation: I started out with something much simpler which just replaced oid_array_sort() in sha1-array.c with a custom sort function before calling oid_array_for_each_unique(). But then dumbly noticed that it doesn't work because the output function was tangled up with the code added in fad6b9e590 ("for_each_abbrev: drop duplicate objects", 2016-09-26) to ensure we don't display duplicate objects. That's why we're doing two passes here, first we need to sort the list and de-duplicate the objects, then sort them in our custom order, and finally output them without re-sorting them. I suppose we could also make oid_array_for_each_unique() maintain a hashmap of emitted objects, but that would increase its memory profile and wouldn't be worth the complexity for this one-off use-case, oid_array_for_each_unique() is used in many other places.