Re: [PATCH 3/4] midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps

On Thu, Sep 09 2021, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Sep 07 2021, Taylor Blau wrote:
>
>> On Wed, Sep 08, 2021 at 03:40:19AM +0200, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Tue, Sep 07 2021, Taylor Blau wrote:
>>>
>>> > +static int git_multi_pack_index_write_config(const char *var, const char *value,
>>> > +					     void *cb)
>>> > +{
>>> > +	if (!strcmp(var, "pack.writebitmaphashcache")) {
>>> > +		if (git_config_bool(var, value))
>>> > +			opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
>>> > +		else
>>> > +			opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE;
>>> > +	}
>>> > +
>>> > +	/*
>>> > +	 * No need to fall-back to 'git_default_config', since this was already
>>> > +	 * called in 'cmd_multi_pack_index()'.
>>> > +	 */
>>> > +	return 0;
>>> > +}
>>> > +
>>> >  static int cmd_multi_pack_index_write(int argc, const char **argv)
>>> >  {
>>> >  	struct option *options;
>>> > @@ -73,6 +90,10 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
>>> >  		OPT_END(),
>>> >  	};
>>> >
>>> > +	opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
>>> > +
>>> > +	git_config(git_multi_pack_index_write_config, NULL);
>>> > +
>>>
>>> Since this is a write-only config option it would seem more logical to
>>> just call git_config() once, and have a git_multi_pack_index_config,
>>> which then would fall back on git_default_config, so we iterate it once,
>>> and no need for a comment about the oddity.
>>
>> Perhaps, but I'm not crazy about each sub-command having to call
>> git_config() itself when 'write' is the only one that actually has any
>> values to read.
>>
>> FWIW, the commit-graph builtin does the same thing as is written here
>> (calling git_config() twice, once in cmd_commit_graph() with
>> git_default_config as the callback and again in cmd_commit_graph_write()
>> with git_commit_graph_write_config as the callback).
>
> I didn't notice your earlier d356d5debe5 (commit-graph: introduce
> 'commitGraph.maxNewFilters', 2020-09-17). As an aside, the test added in
> that commit seems to be broken, or not testing that code change at all:
> if I comment out the git_config(git_commit_graph_write_config, &opts)
> call it still passes.
>
> As a comment on this series, I'd find 4/4 squashed into 3/4 easier to
> read. When I did a "git blame" and found d356d5debe5 I discovered the
> test right away. If and when this gets merged someone might do the same,
> but not find the test as easily (they'd probably then grep for the config
> variable name and find it eventually...).
>
> More importantly, the problem with the commit-graph test seems to apply
> here too: if I comment out the added config reading code the test still
> passes. It seems to be testing something, but not that the config is
> being read.
>
>> So I'm not opposed to cleaning it up, but I'd rather be consistent with
>> the existing behavior. To be honest, I'm not at all convinced that
>> reading the config twice is a bottleneck here when compared to
>> generating a MIDX.
>
> It's never going to matter at all for performance; I should have been
> clearer with my comments. I meant them purely as a "this code is hard to
> follow" comment.
>
> I.e. since we read the config twice, and in both commit-graph.c and
> multi-pack-index.c munge and write to the "opts" struct on
> parse_options(), you'll need to follow logic like:
>
>     1. Read config in cmd_X(), might set variable xyz
>     2. Do parse_options() in cmd_X(), might set variable xyz also
>     3. Now in cmd_X_subcmd(), read config, might set variable xyz
>     4. Do parse_options() in cmd_X_subcmd(), might set variable xyz also
>
> Of course in this case the relevant opts.flags only matters for the
> "write" subcommand, so on more careful reading we don't need to worry
> about the value flip-flopping between config defaults and getopts
> settings, but just in terms of establishing a pattern we'll be following
> in the subcommand built-ins I think this is setting us up for more
> complexity than is needed.
>
> As for being consistent with existing behavior: git-worktree and
> git-stash, which are both similarly structured around subcommands,
> follow the pattern of calling git_config() once. It seems to me better
> to follow that pattern than the one in d356d5debe5 if the config can be
> unambiguously parsed in one pass.
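
To make the "one pass" shape concrete, here is a minimal standalone
sketch, not git code. Every demo_* name is hypothetical:
demo_default_config() stands in for git_default_config(),
demo_read_config() for git_config(), and the boolean parsing is
deliberately simplified to a comparison with "true". The command's
callback handles its own key and hands everything else to the default
handler, so the key/value list is walked once.

```c
#include <string.h>

/* Hypothetical state filled in while reading config. */
struct demo_config {
	int hash_cache;    /* set from pack.writebitmaphashcache */
	int defaults_seen; /* number of keys passed on to the fallback */
};

/* Hypothetical stand-in for git_default_config(). */
static int demo_default_config(const char *var, const char *value, void *cb)
{
	struct demo_config *cfg = cb;

	(void)var;
	(void)value;
	cfg->defaults_seen++;
	return 0;
}

/* Command-specific callback: handle our key, fall back for the rest. */
static int demo_command_config(const char *var, const char *value, void *cb)
{
	struct demo_config *cfg = cb;

	if (!strcmp(var, "pack.writebitmaphashcache")) {
		/* Simplified: real boolean parsing accepts more spellings. */
		cfg->hash_cache = !strcmp(value, "true");
		return 0;
	}
	return demo_default_config(var, value, cb);
}

/* Hypothetical stand-in for git_config(): one walk over all pairs. */
static void demo_read_config(const char *const (*kv)[2], size_t n,
			     int (*fn)(const char *, const char *, void *),
			     void *cb)
{
	for (size_t i = 0; i < n; i++)
		fn(kv[i][0], kv[i][1], cb);
}
```

With this shape there is only one place to look when asking "who set
this value from config?", which is the readability point made above.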

In a similar spirit to my
https://lore.kernel.org/git/87v93bidhn.fsf@xxxxxxxxxxxxxxxxxxx/ I
tried having getopt set plain variables and deriving the flags from
them afterwards, instead of having getopt set the flags directly, and
came up with this on top. It's not for this series, more to muse on how
we can write these subcommands in a simpler manner (or not).
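
As a rough standalone illustration of the "variables first, flags
later" idea (not the patch itself): the DEMO_* bits and the demo_opts
struct below are hypothetical stand-ins, and the real MIDX_* values
live in midx.h. The config-driven setting is kept as a tri-state, where
-1 means "not configured", and the flags word is only built once all
parsing is done.

```c
/* Hypothetical stand-ins for the MIDX_* flag bits. */
#define DEMO_PROGRESS                (1u << 0)
#define DEMO_WRITE_REV_INDEX         (1u << 1)
#define DEMO_WRITE_BITMAP            (1u << 2)
#define DEMO_WRITE_BITMAP_HASH_CACHE (1u << 3)

struct demo_opts {
	int progress;                /* 0 or 1 */
	int write_bitmap;            /* 0 or 1, e.g. from --bitmap */
	int write_bitmap_hash_cache; /* -1 unset, 0 off, 1 on (config) */
};

/* Translate the parsed variables into a flags word in one place. */
static unsigned demo_flags_from_opts(const struct demo_opts *o)
{
	unsigned flags = 0;

	if (o->progress)
		flags |= DEMO_PROGRESS;
	/* Both the -1 "unset" default and an explicit 1 enable it. */
	if (o->write_bitmap_hash_cache)
		flags |= DEMO_WRITE_BITMAP_HASH_CACHE;
	if (o->write_bitmap)
		flags |= DEMO_WRITE_BITMAP | DEMO_WRITE_REV_INDEX;
	return flags;
}
```

Only an explicit 0 from config turns the hash cache off here; the
default-on behavior falls out of the -1 initializer rather than from
setting a bit before the config is read.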

I may have discovered a subtle bug in the process, in
cmd_multi_pack_index_repack() we end up calling write_midx_internal(),
which cares about MIDX_WRITE_REV_INDEX, but only
cmd_multi_pack_index_write() will set that flag, both before & after my
patch. Are we using the wrong flags during repack as a result?

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index dd1652531bf..1b97b2ee4e1 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -45,14 +45,16 @@ static char const * const builtin_multi_pack_index_usage[] = {
 static struct opts_multi_pack_index {
 	const char *object_dir;
 	const char *preferred_pack;
-	unsigned long batch_size;
-	unsigned flags;
-} opts;
+	int progress;
+	int write_bitmap_hash_cache;
+} opts = {
+	.write_bitmap_hash_cache = -1,
+};
 
 static struct option common_opts[] = {
 	OPT_FILENAME(0, "object-dir", &opts.object_dir,
 	  N_("object directory containing set of packfile and pack-index pairs")),
-	OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS),
+	OPT_BOOL(0, "progress", &opts.progress, N_("force progress reporting")),
 	OPT_END(),
 };
 
@@ -61,38 +63,29 @@ static struct option *add_common_options(struct option *prev)
 	return parse_options_concat(common_opts, prev);
 }
 
-static int git_multi_pack_index_write_config(const char *var, const char *value,
-					     void *cb)
+static int git_multi_pack_index_config(const char *var, const char *value,
+				       void *cb)
 {
 	if (!strcmp(var, "pack.writebitmaphashcache")) {
-		if (git_config_bool(var, value))
-			opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
-		else
-			opts.flags &= ~MIDX_WRITE_BITMAP_HASH_CACHE;
+		opts.write_bitmap_hash_cache = git_config_bool(var, value);
+		return 0;
 	}
 
-	/*
-	 * No need to fall-back to 'git_default_config', since this was already
-	 * called in 'cmd_multi_pack_index()'.
-	 */
-	return 0;
+	return git_default_config(var, value, NULL);
 }
 
 static int cmd_multi_pack_index_write(int argc, const char **argv)
 {
 	struct option *options;
+	static int write_bitmap = 0;
 	static struct option builtin_multi_pack_index_write_options[] = {
 		OPT_STRING(0, "preferred-pack", &opts.preferred_pack,
 			   N_("preferred-pack"),
 			   N_("pack for reuse when computing a multi-pack bitmap")),
-		OPT_BIT(0, "bitmap", &opts.flags, N_("write multi-pack bitmap"),
-			MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX),
+		OPT_BOOL(0, "bitmap", &write_bitmap, N_("write multi-pack bitmap")),
 		OPT_END(),
 	};
-
-	opts.flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
-
-	git_config(git_multi_pack_index_write_config, NULL);
+	unsigned flags = 0;
 
 	options = add_common_options(builtin_multi_pack_index_write_options);
 
@@ -107,8 +100,15 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)
 
 	FREE_AND_NULL(options);
 
-	return write_midx_file(opts.object_dir, opts.preferred_pack,
-			       opts.flags);
+	if (opts.progress)
+		flags |= MIDX_PROGRESS;
+	/* On for both the -1 default and an explicit 1 via config */
+	if (opts.write_bitmap_hash_cache)
+		flags |= MIDX_WRITE_BITMAP_HASH_CACHE;
+	if (write_bitmap)
+		flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX;
+
+	return write_midx_file(opts.object_dir, opts.preferred_pack, flags);
 }
 
 static int cmd_multi_pack_index_verify(int argc, const char **argv)
@@ -124,7 +124,7 @@ static int cmd_multi_pack_index_verify(int argc, const char **argv)
 		usage_with_options(builtin_multi_pack_index_verify_usage,
 				   options);
 
-	return verify_midx_file(the_repository, opts.object_dir, opts.flags);
+	return verify_midx_file(the_repository, opts.object_dir, opts.progress);
 }
 
 static int cmd_multi_pack_index_expire(int argc, const char **argv)
@@ -140,14 +140,15 @@ static int cmd_multi_pack_index_expire(int argc, const char **argv)
 		usage_with_options(builtin_multi_pack_index_expire_usage,
 				   options);
 
-	return expire_midx_packs(the_repository, opts.object_dir, opts.flags);
+	return expire_midx_packs(the_repository, opts.object_dir, opts.progress);
 }
 
 static int cmd_multi_pack_index_repack(int argc, const char **argv)
 {
+	static unsigned long batch_size = 0;
 	struct option *options;
 	static struct option builtin_multi_pack_index_repack_options[] = {
-		OPT_MAGNITUDE(0, "batch-size", &opts.batch_size,
+		OPT_MAGNITUDE(0, "batch-size", &batch_size,
 		  N_("during repack, collect pack-files of smaller size into a batch that is larger than this size")),
 		OPT_END(),
 	};
@@ -167,7 +168,8 @@ static int cmd_multi_pack_index_repack(int argc, const char **argv)
 	FREE_AND_NULL(options);
 
 	return midx_repack(the_repository, opts.object_dir,
-			   (size_t)opts.batch_size, opts.flags);
+			   (size_t)batch_size,
+			   opts.progress ? MIDX_PROGRESS : 0);
 }
 
 int cmd_multi_pack_index(int argc, const char **argv,
@@ -175,10 +177,10 @@ int cmd_multi_pack_index(int argc, const char **argv,
 {
 	struct option *builtin_multi_pack_index_options = common_opts;
 
-	git_config(git_default_config, NULL);
+	git_config(git_multi_pack_index_config, NULL);
 
 	if (isatty(2))
-		opts.flags |= MIDX_PROGRESS;
+		opts.progress = 1;
 	argc = parse_options(argc, argv, prefix,
 			     builtin_multi_pack_index_options,
 			     builtin_multi_pack_index_usage,
diff --git a/midx.c b/midx.c
index 6c35dcd557c..3e722888d69 100644
--- a/midx.c
+++ b/midx.c
@@ -1482,7 +1482,7 @@ static int compare_pair_pos_vs_id(const void *_a, const void *_b)
 			display_progress(progress, _n); \
 	} while (0)
 
-int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags)
+int verify_midx_file(struct repository *r, const char *object_dir, int opt_progress)
 {
 	struct pair_pos_vs_id *pairs = NULL;
 	uint32_t i;
@@ -1505,7 +1505,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 	if (!midx_checksum_valid(m))
 		midx_report(_("incorrect checksum"));
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_delayed_progress(_("Looking for referenced packfiles"),
 					  m->num_packs);
 	for (i = 0; i < m->num_packs; i++) {
@@ -1534,7 +1534,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		return verify_midx_error;
 	}
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"),
 						 m->num_objects - 1);
 	for (i = 0; i < m->num_objects - 1; i++) {
@@ -1563,14 +1563,14 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag
 		pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i);
 	}
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_sparse_progress(_("Sorting objects by packfile"),
 						 m->num_objects);
 	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
 	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
 	stop_progress(&progress);
 
-	if (flags & MIDX_PROGRESS)
+	if (opt_progress)
 		progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects);
 	for (i = 0; i < m->num_objects; i++) {
 		struct object_id oid;
diff --git a/midx.h b/midx.h
index 541d9ac728d..0dfe6a54ef3 100644
--- a/midx.h
+++ b/midx.h
@@ -64,7 +64,7 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i
 
 int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
 void clear_midx_file(struct repository *r);
-int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
+int verify_midx_file(struct repository *r, const char *object_dir, int opt_progress);
 int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
 int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags);
 



