On Wed, Mar 23 2022, Taylor Blau wrote:

> On Wed, Mar 23, 2022 at 03:22:13PM -0400, Derrick Stolee wrote:
>> On 3/23/2022 2:03 PM, Josh Steadmon wrote:
>> > prepare_repo_settings() initializes a `struct repository` with various
>> > default config options and settings read from a repository-local config
>> > file. In 44c7e62 (2021-12-06, repo-settings:prepare_repo_settings only
>> > in git repos), prepare_repo_settings was changed to issue a BUG() if it
>> > is called by a process whose CWD is not a Git repository. This approach
>> > was suggested in [1].
>> >
>> > This breaks fuzz-commit-graph, which attempts to parse arbitrary
>> > fuzzing-engine-provided bytes as a commit graph file.
>> > commit-graph.c:parse_commit_graph() calls prepare_repo_settings(), but
>> > since we run the fuzz tests without a valid repository, we are hitting
>> > the BUG() from 44c7e62 for every test case.
>> >
>> > Fix this by refactoring prepare_repo_settings() such that it sets
>> > default options unconditionally; if its process is in a Git repository,
>> > it will also load settings from the local config. This eliminates the
>> > need for a BUG() when not in a repository.
>>
>> I think you have the right idea and this can work.
>
> Hmmm. To me this feels like bending over backwards in
> `prepare_repo_settings()` to accommodate one particular caller. I'm not
> necessarily opposed to it, but it does feel strange to make
> `prepare_repo_settings()` a noop here, since I would expect that any
> callers who do want to call `prepare_repo_settings()` are likely
> convinced that they are inside of a repository, and it probably should
> be a BUG() if they aren't.

I think adding that BUG() was overzealous in the first place, per
https://lore.kernel.org/git/211207.86r1apow9f.gmgdl@xxxxxxxxxxxxxxxxxxx/;
I don't see what purpose it serves to be this overly anal in this code,
and 44c7e62e51e (repo-settings: prepare_repo_settings only in git repos,
2021-12-06) just discusses "what" and not "why".

I think a perfectly fine solution to this is just to revert it:

diff --git a/repo-settings.c b/repo-settings.c
index b4fbd16cdcc..e162c1479bf 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -18,7 +18,7 @@ void prepare_repo_settings(struct repository *r)
 	int manyfiles;
 
 	if (!r->gitdir)
-		BUG("Cannot add settings for uninitialized repository");
+		return;
 
 	if (r->settings.initialized++)
 		return;

I have that in my local integration branch, because I ended up wanting
to add prepare_repo_settings() to usage.c, which may or may not run
inside a repo (and maybe we'll have that config, maybe not).

But really, in common-main.c we do an initialize_the_repository(), so a
"struct repository" is already a thing we have before we get to the
"RUN_SETUP_GENTLY" or whatever in git.c, and a bunch of things all over
the place assume that it's the equivalent of { 0 }-initialized.

If we actually want to turn repository.[ch] into some strict API where
"Thou Shalt Not Use the_repository" unless we're actually in a repo,
surely we should have it be NULL then, and add that BUG() to the likes
of initialize_the_repository().

Except I think there's no point in that, and it would just lead to
pointless churn, so why do it for the settings in particular? Why can't
they just be { 0 }-init'd too?

If some caller cares about the distinction between r->settings being
like it is because we actually have a repo, or because we're using the
defaults, why can't they just check r->gitdir themselves? For the rest,
the default of "just provide the defaults then" is a much saner API.

I think *maybe* what this actually wanted to do was to bridge the gap
between "startup_info->have_repository" and a caller in builtin/ calling
prepare_repo_settings(), i.e. that it was a logic error to have that
RUN_SETUP_GENTLY caller do that.

I can see how that *might* be useful as some sanity assertion, but then
maybe we could add a narrower BUG() just for that case, perhaps via a
builtin_prepare_repo_settings() wrapper in builtin.h or whatever.
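
To make the shape of that concrete, here is a rough, untested sketch. It
assumes stand-in types and names: everything prefixed sketch_/SKETCH_ is
hypothetical and is not the real repository.h, BUG() or builtin.h. It
also applies the defaults when there is no gitdir, as in Josh's patch,
rather than returning early like the revert above. The general helper
never dies, and only the builtin-facing wrapper asserts that a
repository was found.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for illustration only; NOT the real definitions in repository.h. */
struct sketch_repo_settings {
	int initialized;
	int core_commit_graph;	/* one example setting; the default below is illustrative */
	int loaded_config;	/* set to 1 once repo-local config was consulted */
};

struct sketch_repository {
	const char *gitdir;	/* NULL when we are not in a repository */
	struct sketch_repo_settings settings;
};

/* Stand-in for git's BUG(): report the logic error and abort. */
#define SKETCH_BUG(msg) \
	do { fprintf(stderr, "BUG: %s\n", (msg)); abort(); } while (0)

/* General helper: always provides defaults, reads config only with a gitdir. */
static void sketch_prepare_repo_settings(struct sketch_repository *r)
{
	if (r->settings.initialized++)
		return;

	/* Defaults are applied unconditionally. */
	r->settings.core_commit_graph = 1;

	/* No repository: keep the defaults, and do not die. */
	if (!r->gitdir)
		return;

	/* The real code would query the repository's config here. */
	r->settings.loaded_config = 1;
}

/*
 * Hypothetical stricter entry point for callers in builtin/.
 * have_repository plays the role of startup_info->have_repository.
 */
static void sketch_builtin_prepare_repo_settings(struct sketch_repository *r,
						 int have_repository)
{
	if (!have_repository)
		SKETCH_BUG("builtin wants repo settings without a repository");
	sketch_prepare_repo_settings(r);
}
```

With something like this, a caller such as the fuzzer could call the
general helper on a { 0 }-initialized struct and get defaults, while a
RUN_SETUP_GENTLY builtin that forgot to check for a repository would
still trip the assertion.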