Re: [PATCH] fast-export: fix regression skipping some merge-commits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Martin,

On Fri, 20 Apr 2018, Martin Ågren wrote:

> 7199203937 (object_array: add and use `object_array_pop()`, 2017-09-23)
> noted that the pattern `object = array.objects[--array.nr].item` could
> be abstracted as `object = object_array_pop(&array)`.
> 
> Unfortunately, one of the conversions was horribly wrong. Between
> grabbing the last object (i.e., peeking at it) and decreasing the object
> count, the original code would sometimes return early. The updated code
> on the other hand, will always pop the last element, then maybe do the
> early return before doing anything with the object.
> 
> The end result is that merge commits where all the parents have still
> not been exported will simply be dropped, meaning that they will be
> completely missing from the exported data.

Excellent explanation.

> Reintroduce the pattern of first grabbing the last object (using a new
> function `object_array_peek()`), then later poping it. Using
> `..._peek()` and `..._pop()` makes it clear that we are referring to the
> same item, i.e., we do not grab one element, then remove another one.

Instead of using _peek() and _pop() and having to reason about the
correctness, maybe we should simply re-push? See below for my suggested
alternative.

> My sincerest apologies for the stupid train-wreck that the original
> conversion was. Weird interactions between different components can make
> for fun bugs, but this one is just embarassing.

The only way to fail is by doing something. You did something. That is
much better than not doing anything. So please do not be sorry about
introducing a breakage. You did fix it, which makes you double awesome.

> Isaac, this should solve the problem you are seeing. Unfortunately, I do
> not have any experience with building Git for Windows [1]. I really hope
> that this bug did not take up too much of your time. Or eat your data!

It is as easy as

	git clone --depth=1 https://github.com/git-for-windows/git-sdk-64

(downloading half a gigabyte of objects, but then you have almost
everything except for the Git source and one support repository for Git
for Windows), then starting git-bash.exe in its toplevel directory and
calling

	sdk build git

in there. The `sdk` helper is in its infancy, so I could imagine that a
really neat thing would be to be able to build custom branches and bundle
them in a portable Git. Something like `sdk build portable-git --patch
https://public-inbox.org/git/20180420181248.2015922-1-martin.agren@xxxxxxxxx/`.

In the meantime, it should still be doable by calling

	sdk cd git
	sdk init build-extra
	/usr/src/build-extra/apply-from-public-inbox.sh https://public-inbox.org/git/20180420181248.2015922-1-martin.agren@xxxxxxxxx/
	make -j15 && make -j15 strip && make -j15 install
	sdk build installer

and then running that installer.

You could also build a portable Git instead by replacing the last line
with

	/usr/src/build-extra/portable/release.sh 0-test

Or if you want to avoid building a portable Git installer, and instead
copy the files directly, you could try this sequence:

	pacman -S --noconfirm rsync
	mkdir ~/my-test-git
	(cd / && rsync -Rau $(ARCH=x86_64 BITNESS=64 \
		usr/src/build-extra/make-file-list.sh) ~/my-test-git/)

Please let me know how it is going, I am always eager to make the Git for
Windows SDK easier to use, as it will ultimately save me time.

> diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh
> index 866ddf6058..2b46a83a49 100755
> --- a/t/t9350-fast-export.sh
> +++ b/t/t9350-fast-export.sh
> @@ -540,4 +540,26 @@ test_expect_success 'when using -C, do not declare copy when source of copy is a
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'todo' '
> +	test_create_repo merging &&
> +	git -C merging commit --allow-empty -m initial &&

I see that you copied the style of the latest test case, but I have to
admit that I would find it much easier to read if it said:

	(
		cd merging &&
		test_commit initial &&
		git checkout -b topic &&
		test_commit on-branch &&
		git checkout master &&
		test_commit on-master &&
		test_tick &&
		git merge --no-ff -m Yeah topic &&

		echo ":1 $(git rev-parse HEAD^^)" >marks &&
		git fast-export --import-marks=marks master >out &&
		grep Yeah out
	)

i.e. using the subshell where you cd into merging/ first thing, and then
making extensive use of `test_commit`.

> +
> +	git -C merging checkout -b topic &&
> +	>merging/topic-file &&
> +	git -C merging add topic-file &&
> +	git -C merging commit -m topic-file &&
> +
> +	git -C merging checkout master &&
> +	>merging/master-file &&
> +	git -C merging add master-file &&
> +	git -C merging commit -m master-file &&
> +
> +	git -C merging merge --no-ff topic -m "merge the topic" &&
> +
> +	oid=$(git -C merging rev-parse HEAD^^) &&
> +	echo :1 $oid >merging/git-marks &&
> +	git -C merging fast-export --import-marks=git-marks refs/heads/master >out &&
> +	grep "merge the topic" out
> +'
> +
>  test_done
> diff --git a/object.h b/object.h
> index f13f85b2a9..4d8ce280d9 100644
> --- a/object.h
> +++ b/object.h
> @@ -129,6 +129,15 @@ void add_object_array_with_path(struct object *obj, const char *name, struct obj
>   */
>  struct object *object_array_pop(struct object_array *array);
>  
> +/*
> + * Returns NULL if the array is empty. Otherwise, returns the last object.
> + * That is, the returned value is what `object_array_pop()` would have returned.
> + */
> +inline struct object *object_array_peek(const struct object_array *array)

I looked, and this would be the first use of `inline` without `static`...

> +{
> +	return array->nr ? array->objects[array->nr - 1].item : NULL;
> +}
> +
>  typedef int (*object_array_each_func_t)(struct object_array_entry *, void *);
>  
>  /*
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 27b2cc138e..8377d27b46 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -650,9 +650,10 @@ static void handle_tail(struct object_array *commits, struct rev_info *revs,
>  {
>  	struct commit *commit;
>  	while (commits->nr) {
> -		commit = (struct commit *)object_array_pop(commits);
> +		commit = (struct commit *)object_array_peek(commits);
>  		if (has_unshown_parent(commit))
>  			return;
> +		(void)object_array_pop(commits);
>  		handle_commit(commit, revs, paths_of_changed_objects);
>  	}
>  }

As I stated above, I think we can make this a bit easier to reason about
(and less easy to break by future additions) if we avoided the _peek()
function altogether, like this:

 {
 	struct commit *commit;
 	while (commits->nr) {
 		commit = (struct commit *)object_array_pop(commits);
-		if (has_unshown_parent(commit))
+		if (has_unshown_parent(commit)) {
+			/* Queue again, to be handled later */
+			add_object_array(commits, NULL, commit);
 			return;
+		}
 		handle_commit(commit, revs, paths_of_changed_objects);
 	}
 }

(I did not test this, and I was honestly surprised that there is no
object_array_push() counterpart to _pop() ;-) So this might be all wrong.)

Ciao,
Johannes

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux