Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > From what I understand of GGG I think you updated only the summary on > the PR but not the commit itself, the latter is what would go into > git.git. Here the commit should be updated so we get this message. Ah, I have long learned to ignore the blurb after the three-dash line when a patch comes via GGG (because the PR text is often useless and most often people seem to use text identical to the commit log message). I didn't realize that the proposed commit log message was useless in this particular patch. I was wondering why you were talking about an extra blank line after the title there ;-) > The part after the "---" is usually just used in this project for ad-hoc > list-only comment. True. "What I'm not sure about is ..." however does belong to the list-only comment, though. If it is not > The difference between these two APIs is thaht oidset is hash-backed, > and you'd insert into it and we de-duplicate any duplicate OIDs on-the-fly. > > The oid_array is just an realloc()'d "struct object_id *oid". On > insertion you can insert duplicates, but it has the ability to track > "I've sorted these", and "let's iterate over this sorted, and de-dup any > duplicates". > > We have the two APIs for a reason, but I don't know in any of these > cases whether this change is safe. > > Does e.g. index-pack.c always receive de-duplicated OIDs and we were > wasting CPU cycles using an oidset? > > Do some of these like pack-objects.c receive de-duplicated OIDs from > e.g. "git repack" *now*, but we just lack test coverage to see that > they're happy to get duplicate OIDs on stdin (e.g. manually from a > user), and this would introduce a bug? Also, if oid_array is used to produce a de-duplicated list of object names in the current code, it is very likely that oid_array is sorted (perhaps the objects are fed in sorted order), and the callers depend on the order of the objects they find in the array. Throwing sorted list of object names at oidset and then iterating over what is in the oidset would likely to destroy the original ordering. I do not offhand know if the callers are broken by such a change (either correctness-wise or performance-wise). > But most importantly is it worth it? What's the rationale for the > change? Less CPU/memory use? Getting e.g. "hyperfine" or "/usr/bin/time > -v" output for those (if so) would be valuable. Thanks.