On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote: > On Tue, Sep 07 2021, Taylor Blau wrote: > >> On Wed, Sep 08, 2021 at 03:40:19AM +0200, Ævar Arnfjörð Bjarmason wrote: >>> >>> On Tue, Sep 07 2021, Taylor Blau wrote: >>> >>> > +static int git_multi_pack_index_write_config(const char *var, const char *value, >>> > + void *cb) >>> > +{ >>> > + if (!strcmp(var, "pack.writebitmaphashcache")) { >>> > + if (git_config_bool(var, value)) >>> > + opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; >>> > + else >>> > + opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE; >>> > + } >>> > + >>> > + /* >>> > + * No need to fall-back to 'git_default_config', since this was already >>> > + * called in 'cmd_multi_pack_index()'. >>> > + */ >>> > + return 0; >>> > +} >>> > + >>> > static int cmd_multi_pack_index_write(int argc, const char **argv) >>> > { >>> > struct option *options; >>> > @@ -73,6 +90,10 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) >>> > OPT_END(), >>> > }; >>> > >>> > + opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; >>> > + >>> > + git_config(git_multi_pack_index_write_config, NULL); >>> > + >>> >>> Since this is a write-only config option it would seem more logical to >>> just call git_config() once, and have a git_multip_pack_index_config, >>> which then would fall back on git_default_config, so we iterate it once, >>> and no need for a comment about the oddity. >> >> Perhaps, but I'm not crazy about each sub-command having to call >> git_config() itself when 'write' is the only one that actually has any >> values to read. >> >> FWIW, the commit-graph builtin does the same thing as is written here >> (calling git_config() twice, once in cmd_commit_graph() with >> git_default_config as the callback and again in cmd_commit_graph_write() >> with git_commit_graph_write_config as the callback). > > I didn't notice your earlier d356d5debe5 (commit-graph: introduce > 'commitGraph.maxNewFilters', 2020-09-17). As an aside the test added in > that commit seems to be broken or not testing that code change at all, > if I comment out the git_config(git_commit_graph_write_config, &opts) > it'll pass. > > As a comment on this series I'd find 4/4 squashed into 3/4 easier to > read, when I did a "git blame" and found d356d5debe5 I discovered the > test right away, if and when this gets merged someone might do the same, > but not find the test as easily (they'd probably then grep the config > variable name and find it eventually...). > > More importantly, the same issue with the commit-graph test seems to be > the case here, if I comment out the added config reading code it'll > still pass, it seems to be testing something, but not that the config is > being read. > >> So I'm not opposed to cleaning it up, but I'd rather be consistent with >> the existing behavior. To be honest, I'm not at all convinced that >> reading the config twice is a bottleneck here when compared to >> generating a MIDX. > > It's never going to matter at all for performance, I should have been > clearer with my comments. I meant them purely as a "this code is hard to > follow" comment. > > I.e. since we read the config twice, and in both commit-graph.c and > multi-pack-index.c munge and write to the "opts" struct on > parse_options(), you'll need to follow logic like: > > 1. Read config in cmd_X(), might set variable xyz > 2. Do parse_options() in cmd_X(), might set variable xyz also > 3. Now in cmd_X_subcmd(), read config, might set variable xyz > 4. Do parse_options() in cmd_X(), migh set variable xyz also > > Of course in this case the relevant opts.flags only matters for the > "write" subcommand, so on more careful reading we don't need to worry > about the value flip-flopping between config defaults and getopts > settings, but just in terms of establishing a pattern we'll be following > in the subcommand built-ins I think this is setting us up for more > complexity than is needed. > > As far as being consistent with existing behavior, in git-worktree, > git-stash which are both similarly structured subcommands we follow the > pattern of calling git_config() once, it seems to me better to follow > that pattern than the one in d356d5debe5 if the config can be > unambiguously parsed in one pass. In similar spirit as my https://lore.kernel.org/git/87v93bidhn.fsf@xxxxxxxxxxxxxxxxxxx/ I started seeing if not doing the flags via getopt but instead variables & setting the flags later was better, and came up with this on top. Not for this series, more to muse on how we can write these subcommands in a simpler manner (or not). I may have discovered a subtle bug in the process, in cmd_multi_pack_index_repack() we end up calling write_midx_internal(), which cares about MIDX_WRITE_REV_INDEX, but only cmd_multi_pack_index_write() will set that flag, both before & after my patch. Are we using the wrong flags during repack as a result? diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index dd1652531bf..1b97b2ee4e1 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -45,14 +45,16 @@ static char const * const builtin_multi_pack_index_usage[] = { static struct opts_multi_pack_index { const char *object_dir; const char *preferred_pack; - unsigned long batch_size; - unsigned flags; -} opts; + int progress; + int write_bitmap_hash_cache; +} opts = { + .write_bitmap_hash_cache = -1, +}; static struct option common_opts[] = { OPT_FILENAME(0, "object-dir", &opts.object_dir, N_("object directory containing set of packfile and pack-index pairs")), - OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS), + OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")), OPT_END(), }; @@ -61,38 +63,29 @@ static struct option *add_common_options(struct option *prev) return parse_options_concat(common_opts, prev); } -static int git_multi_pack_index_write_config(const char *var, const char *value, - void *cb) +static int git_multi_pack_index_config(const char *var, const char *value, + void *cb) { if (!strcmp(var, "pack.writebitmaphashcache")) { - if (git_config_bool(var, value)) - opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; - else - opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE; + opts.write_bitmap_hash_cache = git_config_bool(var, value); + return 0; } - /* - * No need to fall-back to 'git_default_config', since this was already - * called in 'cmd_multi_pack_index()'. - */ - return 0; + return git_default_config(var, value, NULL); } static int cmd_multi_pack_index_write(int argc, const char **argv) { struct option *options; + static int write_bitmap = 0; static struct option builtin_multi_pack_index_write_options[] = { OPT_STRING(0, "preferred-pack", &opts.preferred_pack, N_("preferred-pack"), N_("pack for reuse when computing a multi-pack bitmap")), - OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"), - MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX), + OPT_BOOL(0, "bitmap", &write_bitmap, N_("write multi-pack bitmap")), OPT_END(), }; - - opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE; - - git_config(git_multi_pack_index_write_config, NULL); + unsigned flags = 0; options = add_common_options(builtin_multi_pack_index_write_options); @@ -107,8 +100,15 @@ static int cmd_multi_pack_index_write(int argc, const char **argv) FREE_AND_NULL(options); - return write_midx_file(opts.object_dir, opts.preferred_pack, - opts.flags); + if (opts.progress) + flags |= MIDX_PROGRESS; + /* Both -1 default and 1 via config */ + if (!opts.write_bitmap_hash_cache) + flags |= MIDX_WRITE_BITMAP_HASH_CACHE; + if (write_bitmap) + flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX; + + return write_midx_file(opts.object_dir, opts.preferred_pack, flags); } static int cmd_multi_pack_index_verify(int argc, const char **argv) @@ -124,7 +124,7 @@ static int cmd_multi_pack_index_verify(int argc, const char **argv) usage_with_options(builtin_multi_pack_index_verify_usage, options); - return verify_midx_file(the_repository, opts.object_dir, opts.flags); + return verify_midx_file(the_repository, opts.object_dir, opts.progress); } static int cmd_multi_pack_index_expire(int argc, const char **argv) @@ -140,14 +140,15 @@ static int cmd_multi_pack_index_expire(int argc, const char **argv) usage_with_options(builtin_multi_pack_index_expire_usage, options); - return expire_midx_packs(the_repository, opts.object_dir, opts.flags); + return expire_midx_packs(the_repository, opts.object_dir, opts.progress); } static int cmd_multi_pack_index_repack(int argc, const char **argv) { + static unsigned long batch_size = 0; struct option *options; static struct option builtin_multi_pack_index_repack_options[] = { - OPT_MAGNITUDE(0, "batch-size", &opts.batch_size, + OPT_MAGNITUDE(0, "batch-size", &batch_size, N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")), OPT_END(), }; @@ -167,7 +168,8 @@ static int cmd_multi_pack_index_repack(int argc, const char **argv) FREE_AND_NULL(options); return midx_repack(the_repository, opts.object_dir, - (size_t)opts.batch_size, opts.flags); + (size_t)batch_size, + opts.progress ? MIDX_PROGRESS : 0); } int cmd_multi_pack_index(int argc, const char **argv, @@ -175,10 +177,10 @@ int cmd_multi_pack_index(int argc, const char **argv, { struct option *builtin_multi_pack_index_options = common_opts; - git_config(git_default_config, NULL); + git_config(git_multi_pack_index_config, NULL); if (isatty(2)) - opts.flags |= MIDX_PROGRESS; + opts.progress = 1; argc = parse_options(argc, argv, prefix, builtin_multi_pack_index_options, builtin_multi_pack_index_usage, diff --git a/midx.c b/midx.c index 6c35dcd557c..3e722888d69 100644 --- a/midx.c +++ b/midx.c @@ -1482,7 +1482,7 @@ static int compare_pair_pos_vs_id(const void *_a, const void *_b) display_progress(progress, _n); \ } while (0) -int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags) +int verify_midx_file(struct repository *r, const char *object_dir, int opt_progress) { struct pair_pos_vs_id *pairs = NULL; uint32_t i; @@ -1505,7 +1505,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag if (!midx_checksum_valid(m)) midx_report(_("incorrect checksum")); - if (flags & MIDX_PROGRESS) + if (opt_progress) progress = start_delayed_progress(_("Looking for referenced packfiles"), m->num_packs); for (i = 0; i < m->num_packs; i++) { @@ -1534,7 +1534,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag return verify_midx_error; } - if (flags & MIDX_PROGRESS) + if (opt_progress) progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"), m->num_objects - 1); for (i = 0; i < m->num_objects - 1; i++) { @@ -1563,14 +1563,14 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i); } - if (flags & MIDX_PROGRESS) + if (opt_progress) progress = start_sparse_progress(_("Sorting objects by packfile"), m->num_objects); display_progress(progress, 0); /* TODO: Measure QSORT() progress */ QSORT(pairs, m->num_objects, compare_pair_pos_vs_id); stop_progress(&progress); - if (flags & MIDX_PROGRESS) + if (opt_progress) progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects); for (i = 0; i < m->num_objects; i++) { struct object_id oid; diff --git a/midx.h b/midx.h index 541d9ac728d..0dfe6a54ef3 100644 --- a/midx.h +++ b/midx.h @@ -64,7 +64,7 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags); void clear_midx_file(struct repository *r); -int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags); +int verify_midx_file(struct repository *r, const char *object_dir, int opt_progress); int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags); int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags);