Re: [PATCH 1/2] pack-objects: Teach --use-bitmap-index codepath to respect --local, --honor-pack-keep and --incremental

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Kirill Smelkov <kirr@xxxxxxxxxx> writes:

> ---- 8< ----
> From: Kirill Smelkov <kirr@xxxxxxxxxx>
> Subject: [PATCH v3] pack-objects: Teach --use-bitmap-index codepath to respect
>  --local, --honor-pack-keep and --incremental

(Not a question to Kirill)

Hmph.  I suspect that handling of in-body header by mailinfo not
prepared to see RFC2822 header folding.  "am -c" gives a single line
subject with " --local ..." as its first line in the body.

I'll leave it as a low-hanging fruit for somebody to fix ;-)

	Subject: pack-objects: respect --local, etc. when bitmap is in use

might be shorter and more to the point, anyway.

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index c4c2a3c..e06c1bf 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -944,13 +944,45 @@ static int have_duplicate_entry(const unsigned char *sha1,
>  	return 1;
>  }
>  
> +static int want_found_object(int exclude, struct packed_git *p)
> +{
> +	if (exclude)
> +		return 1;
> +	if (incremental)
> +		return 0;
> +
> +	/*
> +	 * When asked to do --local (do not include an
> +	 * object that appears in a pack we borrow
> +	 * from elsewhere) or --honor-pack-keep (do not
> +	 * include an object that appears in a pack marked
> +	 * with .keep), we need to make sure no copy of this
> +	 * object come from in _any_ pack that causes us to
> +	 * omit it, and need to complete this loop.  When
> +	 * neither option is in effect, we know the object
> +	 * we just found is going to be packed, so break
> +	 * out of the search loop now.
> +	 */

The blame is mine, but "no copy of this object appears in _any_ pack"
would be more correct and easier to read.

This code is no longer in a search loop; its caller is.  Further
rephrasing is needed.  "When asked to do ...these things..., finding
a pack that matches the criteria is sufficient for us to decide to
omit it.  However, even if this pack does not satisify the criteria,
we need to make sure no copy of this object appears in _any_ pack
that makes us to omit the object, so we need to check all the packs.
Signal that by returning -1 to the caller." or something along that
line.

>  /*
>   * Check whether we want the object in the pack (e.g., we do not want
>   * objects found in non-local stores if the "--local" option was used).
>   *
> - * As a side effect of this check, we will find the packed version of this
> - * object, if any. We therefore pass out the pack information to avoid having
> - * to look it up again later.
> + * As a side effect of this check, if object's pack entry was not already found,
> + * we will find the packed version of this object, if any. We therefore pass
> + * out the pack information to avoid having to look it up again later.

The reasoning leading to "We therefore" is understandable, but "pass
out the pack information" is not quite.  Is this meant to explain
the fact that *found_pack and *found_offset are in-out parameters?

The explanation to justify why *found_pack and *found_offset that
used to be out parameters are made in-out parameters belongs to the
log message.  We do not want this in-code comment to explain the
updated code relative to what the code used to do; that is not
useful to those who read the code for the first time in the context
of the committed state.

        /* 
         * Check whether we want to pack the object in the pack (e.g. ...).
         *
         * If the caller already knows an existing pack it wants to
         * take the object from, that is passed in *found_pack and
         * *found_offset; otherwise this function finds if there is
         * any pack that has the object and returns the pack and its
         * offset in these variables.
         */

> diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> index 3893afd..e71caa4 100755
> --- a/t/t5310-pack-bitmaps.sh
> +++ b/t/t5310-pack-bitmaps.sh
> @@ -7,6 +7,19 @@ objpath () {
>  	echo ".git/objects/$(echo "$1" | sed -e 's|\(..\)|\1/|')"
>  }
>  
> +# show objects present in pack ($1 should be associated *.idx)
> +packobjects () {
> +	git show-index <$1 | cut -d' ' -f2
> +}

That is a misleading name for a helper function that produces a list
of objects that were packed.  "list_packed_objects", perhaps.

> +# hasany pattern-file content-file
> +# tests whether content-file has any entry from pattern-file with entries being
> +# whole lines.
> +hasany () {
> +	# NOTE `grep -f` is not portable
> +	git grep --no-index -qFf $1 $2
> +}

I doubt "grep -f pattern_file" is not portable, but in any case, it
is probably a good idea to have this helper function to make the
caller easier to read.  Please name it "has_any", though, and quote
"$1" and "$2" as they are meant to be able to take any filename.

> +test_expect_success 'pack-objects respects --local (non-local loose)' '
> +	mkdir -p alt_objects/pack &&

I'd really really prefer to see an empty repository created for
this.  Even though the original intent was .git/objects/ alone,
i.e. GIT_OBJECT_DIRECTORY can exist without associated refs, we
discovered that it is in general not a good idea (think: "gc").

> +	echo $(pwd)/alt_objects >.git/objects/info/alternates &&
> +	echo content1 >file1 &&
> +	# non-local loose object which is not present in bitmapped pack
> +	objsha1=$(GIT_OBJECT_DIRECTORY=alt_objects git hash-object -w file1) &&

Don't say "sha" when you mean "object name".  Otherwise you would
end up introducing funky variable names like $objsha2 we see below
that is confusing (we don't use SHA-2).

> +	# non-local loose object which is also present in bitmapped pack
> +	git cat-file blob $blob | GIT_OBJECT_DIRECTORY=alt_objects git hash-object -w --stdin &&
> +	git add file1 &&
> +	test_tick &&
> +	git commit -m commit_file1 &&
> +	echo HEAD | git pack-objects --local --stdout --revs >1.pack &&
> +	git index-pack 1.pack &&
> +	packobjects 1.idx >1.objects &&
> +	printf "$objsha1\n$blob\n" >nonlocal-loose &&

I think Peff meant to suggest this instead:

	printf "%s\n" "$objsha1" "$blob"

> +	if hasany nonlocal-loose 1.objects; then
> +		echo "Non-local object present in pack generated with --local"
> +		return 1
> +	fi

Just saying

	! has_any nonlocal-loose 1.objects

is sufficient.  Same comment for all other uses of these verbose
output.

Besides, we spell "if/then/fi" like this:

	if condition
        then
        	body
	fi

without a semicolon.

> +test_expect_success 'pack-objects respects --honor-pack-keep (local non-bitmapped pack)' '
> +...
> +	touch .git/objects/pack/pack2-$pack2.keep &&

Please don't do "touch" _unless_ you care about the timestamp of the
file.  Redirect an empty command into it, i.e.

	>.git/objects/pack/pack2-$pack2.keep

or

	echo "reason to keep it" >.git/objects/pack/pack2-$pack2.keep

instead.

> +test_expect_success 'pack-objects respects --honor-pack-keep (local bitmapped pack)' '
> +	ls .git/objects/pack/ | grep bitmap >output &&
> +	test_line_count = 1 output &&
> +	packbitmap=$(basename $(cat output) .bitmap) &&
> +	packobjects .git/objects/pack/$packbitmap.idx >packbitmap.objects &&
> +	touch .git/objects/pack/$packbitmap.keep &&
> +	echo HEAD | git pack-objects --honor-pack-keep --stdout --revs >3a.pack &&
> +	git index-pack 3a.pack &&
> +	packobjects 3a.idx >3a.objects &&
> +	if hasany packbitmap.objects 3a.objects; then
> +		echo "Object from .keeped bitmapped pack present in pack generated with --honour-pack-keep"
> +		return 1
> +	fi &&
> +	rm .git/objects/pack/$packbitmap.keep

Arrange this removal to happen even when any earlier step fails, so
that later tests will not get affected by stray existence of this
file, by using test_when_finished.  E.g.

	list_packed_objects .git/objects/pack/$packbitmap.idx >packbitmap.objects &&
	test_when_finished "rm -f .git/objects/pack/$packbitmap.keep" &&
	>.git/objects/pack/$packbitmap.keep" &&

> +test_expect_success 'pack-objects respects --local (non-local bitmapped pack)' '
> +	mv .git/objects/pack/$packbitmap.* alt_objects/pack/ &&
> +	echo HEAD | git pack-objects --local --stdout --revs >3b.pack &&
> +	git index-pack 3b.pack &&
> +	packobjects 3b.idx >3b.objects &&
> +	if hasany packbitmap.objects 3b.objects; then
> +		echo "Non-local object from bitmapped pack present in pack generated with --local"
> +		return 1
> +	fi &&
> +	mv alt_objects/pack/$packbitmap.* .git/objects/pack/

Ditto on potential use of test_when_finished.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]