On Wed, Jun 16, 2021 at 10:04 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> "Elijah Newren via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>
> > +	/* Ignore clean entries */
> > +	if (ci->merged.clean)
> > +		continue;
> > +
> > +	/* Ignore entries that don't need a content merge */
> > +	if (ci->match_mask || ci->filemask < 6 ||
> > +	    !S_ISREG(ci->stages[1].mode) ||
> > +	    !S_ISREG(ci->stages[2].mode) ||
> > +	    oideq(&ci->stages[1].oid, &ci->stages[2].oid))
> > +		continue;
> > +
> > +	/* Also don't need content merge if base matches either side */
> > +	if (ci->filemask == 7 &&
> > +	    S_ISREG(ci->stages[0].mode) &&
> > +	    (oideq(&ci->stages[0].oid, &ci->stages[1].oid) ||
> > +	     oideq(&ci->stages[0].oid, &ci->stages[2].oid)))
> > +		continue;
>
> Even though this is unlikely to change, it is unsatisfactory that we
> reproduce the knowledge on the situations when a merge will
> trivially resolve and when it will need to go content level.

I agree, it's not the nicest.

> One obvious way to solve it would be to fold this logic into the
> main code that actually merges a list of "ci"s by making it a two
> pass process (the first pass does essentially the same as this new
> function, the second pass does the tree-level merge where the above
> says "continue", fills mmfiles with the loop below, and calls into
> ll_merge() after the loop to merge), but the logic duplication is
> not too big and it may not be worth such a code churn.

I'm worried even more about the resulting complexity than the code
churn.  The two-pass model, which I considered, would require special
casing so many of the branches of process_entry() that it feels like
it'd be increasing code complexity more than introducing a function
with a few duplicated checks.
process_entry() was already a function that Stolee reported as coming
across as pretty complex to him in earlier rounds of review, but that
seems to just be intrinsic given the number of special cases: handling
anything from entries with D/F conflicts, to different file types, to
match_mask being precomputed, to recursive vs. normal cases, to
modify/delete, to normalization, to added on one side, to deleted on
both sides, to three-way content merges.  The three-way content merges
are just one of 9-ish different branches, and are the only one that
we're prefetching for.  It just seems easier and cleaner overall to add
these three checks to pick off the cases that will end up going through
the three-way content merges.

I've looked at it again a couple times over the past few days based on
your comment, but I still can't see a way to restructure it that feels
cleaner than what I've currently got.

Also, it may be worth noting here that if these checks fell out of date
with process_entry() in some manner, it still would not affect the
correctness of the code.  At worst, it'd only affect whether enough or
too many objects are prefetched.  If too many, then some extra objects
would be downloaded, and if too few, then we'd end up fetching the
additional objects 1-by-1 on demand later.

So I'm going to agree with the not-worth-it portion of your final
sentence and leave this out of the next roll.