> Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes: > > > My idea is that this prefetch is a superset of what diffcore_rebase() > > wants to prefetch, so if we have already done the necessary logic here > > (even if nothing gets prefetched - which might be the case if we have > > all objects), we do not need to do it in diffcore_rebase(). > > s/rebase/rename/ I presume, Ah, yes. > but the above reasoning, while it may > happen to hold true right now, feels brittle. In other words > > - how do we know it would stay to be "a superset"? > > - would it change the picture if we later added a prefetch in > diffcore_break(), just like you are doing so to diffcore_rename() > in this patch? > > So the revised version of my earlier "wondering" is if it would be > more future-proof (easier to teach different steps to prefetch for > their own needs, without having to make an assumption like "what > this step needs is sufficient for the other step") to arrange the > codepath from diffcore_std() to its helpers like so: > > - prepare an empty to_fetch OID array in the caller, > > - if the output format is one of the ones that wants prefetch, > add object names to to_fetch in the caller, but not fetch as > long as the caller does not yet need the contents of the > blobs. > > - pass &to_fetch from diffcore_std() to the helper functions in > the diffcore family like diffcore_{break,rename}() have them > also batch what they (may) want to prefetch in there. Delay > fetching until they actually need to look at the blobs, and > when they fetch, clear &to_fetch for the next helper. > > - diffcore_std() also would need to look at the blob eventually, > perhaps after all the helpers it may call returns. Do the > final prefetch if to_fetch is still not empty before it has to > look at the blobs. Ah...that makes sense. Besides the accumulating of prefetch targets (which makes deduplication more necessary - I might make a "sort_and_uniq" function on oid_array that updates the oid_array in-place), looking at the code, it's not only diffcore_break() which might need prefetching, but diffcore_skip_stat_unmatch() too (not to speak of the functions that come after diffcore_rename()). The least brittle way is probably to have diff_populate_filespec() do the prefetching. I'll take a further look.