Glen Choo wrote: > This patch adds another instance of copying fields from "struct > repository_format" to "struct repository", so I think that we should > start doing this with a helper function instead of copy-pasting the > logic. > > As for what should be in the helper function, the above hunks suggest > that we should copy .hash_algo, .partial_clone, and .worktree_config. > However... > ... > > This hunk does not copy .hash_algo. I initially wondered if it is safe > to just copy .hash_algo here too, but I now suspect that we shouldn't > have done the_repository setup in discover_git_directory() in the first > place. It isn't used by the setup.c machinery - its one caller in "git" > (it's used by "scalar") is read_early_config(), which is supposed to > work without a fully set up repository, and bears a comment saying that > "no global state is changed" by calling discover_git_directory() (which > stopped being true in ebaf3bcf1ae). It looks like > discover_git_directory() is just a lightweight entrypoint into the > setup.c machinery. 16ac8b8db6 (setup: introduce the > discover_git_directory() function, 2017-03-13)) says "Let's just provide > a convenient wrapper function with an easier signature that *just* > discovers the .git/ directory. We will use it in a subsequent patch to > fix the early config." > > If I'm wrong and we _should_ be doing the_repository setup, then I'm > guessing it's safe to copy .hash_algo here too. So either way, I think > we should introduce a helper function to do the copying, especially > because we will probably need to repeat this process yet again for > "repository_format_precious_objects". Thanks for pointing this out & sharing your findings! I agree with the desire to reduce code duplication, but the reason I avoided that refactor when putting these patches together is because of the subtle differences across the different repository format assignment blocks. For example, in addition to what you mentioned here w.r.t. '.hash_algo', there are also differences in how 'repository_format_partial_clone' is assigned: it's deep-copied in 'check_repository_format', but shallow-copied (then subsequently NULL'd in the 'struct repository_format' to avoid freeing the pointer when the struct is disposed of) in 'discover_git_directory()' & 'setup_git_directory_gently()'. If we were to settle on a single "copy repository format settings" function, it's not obvious what the "right" approach is. We could change 'check_repository_format()' to the shallow-copy-then-null like the others: its two callers (in 'init-db.c' and 'path.c') don't use the value of 'repository_format_partial_clone' in 'struct repository_format' after calling 'check_repository_format()'. But, if we did that, it'd introduce a side effect to the input 'struct repository_format', which IMO would be surprising behavior for a function called 'check_<something>()'. Conversely, unifying on a deep copy or adding a flag to toggle deep vs. shallow copy feels like unnecessary complexity if we don't actually need a deep copy. Beyond the smaller subtleties, there's the larger question (that you sort of get at with the questions around 'discover_git_directory()') as to whether we should more heavily refactor or consolidate these setup functions. The similar code implies "yes", but such a refactor feels firmly out-of-scope for this series. A smaller change (e.g. just moving the assignments into their own function) could be less of a diversion, but any benefit seems like it'd be outweighed by the added churn/complexity of a new function. In any case, sorry for the long-winded response. I'd initially tried to implement your feedback, but every time I did I'd get stopped up on the things I mentioned above. So, rather than continue to put off responding to this thread, I tried to capture what kept stopping me from moving forward - hopefully it makes (at least a little bit of) sense!