On Fri, Dec 16 2022, Derrick Stolee via GitGitGadget wrote: Thanks for the update! > Updates since v3 > ================ > > * Patch 1 uses an "int" instead of "unsigned int". This was a carry-over > from Kevin's patch in microsoft/git, but our use of it in patch 2 is > different and thus is better with a signed int. > * Patch 2 uses a fallback to the_repository if istate->repo is NULL. This > allows the setting to be applied to more cases where istate->repo is not > set. > * Patch 2 also uses 'start' in more places, since it already computed the > beginning of the hash. > * Patch 4 slightly modifies its use of prepare_repo_settings() due to Patch > 2's fallback to the_repository. It's good that it's "int" now instead of "unsigned int", but just doing that search-replacement I think misses the implied (which I really should have made more explicit) question in my v3 feedback (https://lore.kernel.org/git/221215.861qp03agm.gmgdl@xxxxxxxxxxxxxxxxxxx/): What should we do about the -1 state then? With your 2/4 here we'll accept e.g.: ./git -c index.skipHash=12345 status As well as: ./git -c index.skipHash=blahblah status But with the migration to repo-settings.c we start refusing the latter of those, as you inherit a stricture in repo-settings.c. Aside: I think this series would be much more readable if it were just squashed into one patch. 1/4 introduces code, but no way to reach it, with 2/4 we have config, 3/4 adds a test 4/4 tweaks how to read/parse/interpret/combine the config. But as it is split up the individual steps should make sense. The 2/4 here should really just use "bool", not "maybe_bool" to begin with, no? And part of this in 4/4 is inheriting a non-stricture in repo-settings.c, but for this new config variable that we're introducing as only a boolean from day one can't we just die() on anything that's not a boolean? On other implied feedback, in https://lore.kernel.org/git/221215.865yec3b1j.gmgdl@xxxxxxxxxxxxxxxxxxx/ I noted the switching between istate->repo & the_repository, and that you could hit a BUG() (when uncommenting a line in my testing patch on top) if we didn't have istate->repo: There was a commit message update for that: > 2: 00738c81a12 ! 2: aae047cbc9f read-cache: add index.skipHash config option > @@ Commit message > > [1] https://github.com/microsoft/git/commit/21fed2d91410f45d85279467f21d717a2db45201 > > + We load this config from the repository config given by istate->repo, > + with a fallback to the_repository if it is not set. But I think that really sweeps a potential issue under the rug. What are these cases where we don't get istate->repo, are those expected? Is preferring istate->repo always known to be redundant now, and just done for good measure, or are there subtle cases (e.g. when reading submodules) where we pick the wrong repo's config? In that context I'd really like to see some testing of submodules in the added tests, i.e. does this act as we'd like it to when you set the config as "true" in the parent, but "false" in the submodule, & the other way around? That's a case that should stress this "the_repository" v.s. "istate->repo". > While older Git versions will not recognize the null hash as a special > case, the file format itself is still being met in terms of its > structure. Using this null hash will still allow Git operations to > @@ read-cache.c: static int verify_hdr(const struct cache_header *hdr, unsigned lon > the_hash_algo->update_fn(&c, hdr, size - the_hash_algo->rawsz); > the_hash_algo->final_fn(hash, &c); > - if (!hasheq(hash, (unsigned char *)hdr + size - the_hash_algo->rawsz)) > -+ if (!hasheq(hash, end - the_hash_algo->rawsz)) > ++ if (!hasheq(hash, start)) > return error(_("bad index file sha1 signature")); > return 0; Thanks, good to have this suggested simplification. > @@ read-cache.c: static int do_write_index(struct index_state *istate, struct tempfile *tempfile, > + int ieot_entries = 1; > + struct index_entry_offset_table *ieot = NULL; > + int nr, nr_threads; > ++ struct repository *r = istate->repo ? istate->repo : the_repository; > > f = hashfd(tempfile->fd, tempfile->filename.buf); > > -+ git_config_get_maybe_bool("index.skiphash", (int *)&f->skip_hash); > ++ repo_config_get_maybe_bool(r, "index.skiphash", &f->skip_hash); > + > for (i = removed = extended = 0; i < entries; i++) { Re the above this looks good, but did we introduce an untested behavior change here or not? > if (cache[i]->ce_flags & CE_REMOVE) > 3: 86370af1351 = 3: 27fbe52e748 test-lib-functions: add helper for trailing hash > 4: 6490bd445eb ! 4: 075921514f2 features: feature.manyFiles implies fast index writes > @@ read-cache.c: static int do_write_index(struct index_state *istate, struct tempf > > f = hashfd(tempfile->fd, tempfile->filename.buf); > > -- git_config_get_maybe_bool("index.skiphash", (int *)&f->skip_hash); > -+ if (istate->repo) { > -+ prepare_repo_settings(istate->repo); > -+ f->skip_hash = istate->repo->settings.index_skip_hash; > -+ } > +- repo_config_get_maybe_bool(r, "index.skiphash", &f->skip_hash); > ++ prepare_repo_settings(r); > ++ f->skip_hash = r->settings.index_skip_hash; > > for (i = removed = extended = 0; i < entries; i++) { > if (cache[i]->ce_flags & CE_REMOVE) I really don't care per-se where we read the config, as long as it's doing what we expect from the UX point of view. But in your https://lore.kernel.org/git/9e754af8-ecd3-6aed-60e8-2fc09a6a8759@xxxxxxxxxx/ you noted "There's no reason to load the config inside repo-settings.c unless it's part of something like feature.manyFiles.". I think that's true, but on the flip side of that: Is there a reason to move the reading of such localized config (only needed in read-cache.c, as 2/4 shows) to repo-settings.c, just because it's now relying on the "manyfiles"? I think the unstated reason for that is that while we read the "manyfiles" config we didn't stick it in the "struct repo_settings", therefore the reading of it is conveniently done in repo-settings.c, which the 4/4 here does. But I think it makes more sense to just stick the "manyfiles" in that struct, not stick "skip_hash" there, and then just check the "manyfiles" in 4/4, but read "index.skiphash" locally. Maybe not worth the churn at this point, but I do wonder if we're going in the wrong direction here, i.e. if all config that happens to rely on features must be added to repo-settings.c, or whether we should reserve that for truly "global" (or mostly global) config.