Re: Bug in git archive + .gitattributes + relative path

On 06.03.23 at 17:56, Junio C Hamano wrote:
> René Scharfe <l.s.r@xxxxxx> writes:
>
>>    $ git archive --strip-components=1 HEAD sha1dc | tar tf -
>>    .gitattributes
>>    LICENSE.txt
>>    sha1.c
>>    sha1.h
>>    ubc_check.c
>>    ubc_check.h
>
> What should happen to paths that match the given pathspec that do
> not have enough number of components?  E.g. "cache.h" when the
> command is "git archive --strip-components=1 HEAD \*.h"?  Should it
> be documented?

Entries whose full path is stripped away don't make it into the archive.
That behavior is copied from bsdtar along with the option name and most
of its description in git-archive.txt.
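
To illustrate the rule, here is a minimal standalone sketch.  The helper
name strip_components() is hypothetical and this is not the code from the
patch, which works on base/baselen rather than a NUL-terminated full path:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of archive.c: return a pointer into
 * `path` just past its first `n` slash-terminated components, or NULL
 * if nothing would be left, in which case the entry is skipped.
 */
static const char *strip_components(const char *path, int n)
{
	for (int i = 0; i < n; i++) {
		const char *slash = strchr(path, '/');
		if (!slash)
			return NULL;	/* not enough components */
		path = slash + 1;
	}
	/* A path consisting only of stripped components leaves nothing. */
	return *path ? path : NULL;
}
```

With n = 1, "sha1dc/sha1.c" yields "sha1.c", while "cache.h" yields NULL
and would be left out of the archive.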

Alternatively we could warn or die.  The latter would be a bit awkward
because we'd either have to check all paths first or risk reporting them
after writing at least some headers.

No strong preference, but following the precedent set by bsdtar makes
the most sense to me.

>> The new option does not affect the paths of entries added by --add-file
>> and --add-virtual-file because they are handcrafted to their desired
>> values already.  Similarly, the value of --prefix is not subject to
>> component stripping.
>
> Very sensible.
>
>> diff --git a/archive.c b/archive.c
>> index 9aeaf2bd87..8308d4d9c4 100644
>> --- a/archive.c
>> +++ b/archive.c
>> @@ -166,6 +166,18 @@ static int write_archive_entry(const struct object_id *oid, const char *base,
>>  		args->convert = check_attr_export_subst(check);
>>  	}
>
> We probably could save attribute lookup overhead by moving the new
> logic a bit higher in the function?
>
> No, that would invalidate the path_without_prefix variable by using
> strbuf_remove() on &path, and will break the attribute look-up.  The
> variable is used only once before this point and never used later,
> but as an independent future-proofing, we may want to remove the
> variable or narrow the scope.  It's totally out of scope of the
> patch, though.

Would you have noticed that attribute lookup breakage without the
presence of that variable? :)

The sad thing is that we concatenate base and filename here and
then attr.c::collect_some_attrs() goes and splits them again.  It
also uses the concatenated path, but perhaps that can be avoided?

>> +	if (args->strip_components > 0) {
>> +		size_t orig_baselen = baselen;
>> +		for (int i = 0; i < args->strip_components; i++) {
>> +			const char *slash = memchr(base, '/', baselen);
>> +			if (!slash)
>> +				return S_ISDIR(mode) ? READ_TREE_RECURSIVE : 0;
>> +			baselen -= slash - base + 1;
>> +			base = slash + 1;
>> +		}
>> +		strbuf_remove(&path, args->baselen, orig_baselen - baselen);
>> +	}
>
> Nice to see that the core logic of the new feature is surprisingly
> small.
>
>>  	if (args->verbose)
>>  		fprintf(stderr, "%.*s\n", (int)path.len, path.buf);
>
> By having the verbose output after the path stripping, we won't show
> the leading components we stripped, making it similar to what we
> would see when we piped the resulting archive to "| tar tf -".  I
> guess this makes more sense than showing the original path.

Right, printing the path as it appears in the archive makes sense.
bsdtar does the same.

René