Re: [PATCH] Additional git-archive tests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nick Townsend <nick.townsend@xxxxxxx> writes:

>> In other words, are all these new tests expected to pass?
>
> Sorry about that - the first four tests do pass with 1.8.5, the last
> three tests do not currently pass.

OK.

>> So I do not think it is expected to accept tree object names with
>> the HEAD:<path> style syntax, if the user expects a predictable and
>> consistent result.  The third step above attempts to make sure that
>> you name a tree-ish that corresponds to the top-level of the
>> project, i.e. with no <path>.
>> 
> That’s not quite right - paths do work from the root directory - see tests.

I know that they work from the very top, but they _only_ happens to
work from the very top.  As I do not see the code is designed to
compare the prefix (i.e. the subdirectory you are in) and the path
that is tucked after the name of the top-level tree object with a
colon (e.g. path in "HEAD:path") and adjust these two accordingly, I
do not think it was designed to deal with that at all.  The code
just assumes that the tree object corresponds to the top-level, and
the adjustment is done solely by adding the prefix when limiting the
output by paths.

This observation *is* very right. Not designing to deal with that
case may or may not be right, but that is a different issue.

> It was this very useful capability that I sought to add and generalized
> the code to do. 
>> What seems to be supported are:
>> 
>>    cd a && git archive HEAD ;# archives HEAD:a tree
>>    cd a && git archive HEAD -- b ;# archives a/b/ part of HEAD:a as b/
>> 
>> Specifically, it appears that HEAD:./b, HEAD:b etc. are not designed
>> to work, at least to me.
>> ...
>> I am not saying that these should _not_ work, but it is unclear what
>> it means to "work".  For example, what should this do?
>> 
>>    cd a && git archive HEAD:./b $pathspec
>> 
> I think we can clear this up by documenting the cases and choosing
> sensible behaviour _for git-archive_ ie. what people might expect.

I am afraid that this topic is not limited to "git-archive";
"sensible behaviour _for git-archive_" will end up being far from
sufficient.  I would not be surprised if "ls-tree" shares the same
issue, for example.  If you had "subdir/subsubdir/path" (and other
paths) tracked in your project,

    cd subdir
    git ls-tree HEAD:./subsubdir path

is likely to have the same issue as

    cd subdir
    git archive HEAD:./subsubdir path

I would think.

> Here is a suggestion:
>
> a. The pathspec is used as a selector to include things in the archive.
> it is applied after the cwd and treeish selection.

That would mean that doing this in the top-level:

        git archive HEAD:subdir/subsubdir path<TAB>

will not auto-complete the pathname.  In all other Git commands,
vanilla pathspecs (at least the ones that do not use the magic
escape hatch ":/path", etc.) are relative to your current directory,
and I do not think we want a single command to work in an
inconsistent way with respect to that.

> b. The current working directory (if present) prefixes a path in the object
> if one is present.

That is already done at a lower-level get_sha1_with_context(), I
think.

	cd subdir
	git archive HEAD:./subsubdir

will try to use tree subdir/subsubdir in the HEAD commit (but due to
the prefix logic that is not preprared to use such a tree object,
you will get "current working directory is untracked" error).

> c. The path in the object (if present) is prefixed by the cwd (if present) and
> used to select items for archiving. However the composite path so created
> *is not present* in the archive - ie. the leading components are stripped.
> (This is very useful behaviour for creating archives without leading paths)
>
> d. As currently the ―prefix option (not the prefix from setup_git_directory)
>  is prepended to all entries in the archive.
>
> These correspond to the use cases that I have for git archive - extracting all
> or part of a multiple submodule tree into a tgz file, with or without leading
> directories.
>
>> The extended SHA-1 expression HEAD:./b in the subdirectory a/ is
>> interpreted by get_sha1_with_context_1() to be the name of the tree
>> object at path "a/b" in the commit HEAD.  Further, you are giving a
>> pathspec while in a subdirectory a/ to the command.  What should
>> that pathspec be relative to?
>> 
>> In a normal Git command, the pathspec always is relative to the
>> current subdirectory, so, the way to learn about the tree object
>> a/b/c in the HEAD while in subdirectory a/ would be:
>> 
>>    cd a && git ls-tree HEAD b/c
>> 
>> But what should the command line for archive to grab HEAD:a/b/c be?
>> It feels wrong to say:
>> 
>>    cd a && git archive HEAD:./b b/c
> It’s clear to me that if you are in a subdirectory, that is an implicit prefix on the 
> ./b so you retrieve HEAD:a/b  which then archives everything in b without the
> leading a/b. The pathspec is wrong (including the b) and this archive is empty. 
>> 
>> and it also feels wrong to say
>> 
>>    cd a && git archive HEAD:./b c
>> 
> That looks fine to me given the rules suggested above.
>
> Also git-parse manages
> to return the correct thing in this case, so I’d expect this to work.
>
>> No matter what we would do, we should behave consistently with this
>> case:
>> 
>>    treeish=$(git rev-parse HEAD:a/b) &&
>>    cd a &&
>>    git archive $treeish -- $pathspec
>> 
> Well this will fail both in the existing case (‘fatal: current
> working directory is untracked’) and with the scheme proposed
> above (‘fatal: invalid object name $treeish:a/‘)

The thing is the "current working directory is untracked" is a
fallout from the current archive code not expecting to see a
tree-ish specifed in HEAD:<path> style to interact with $cwd (aka
prefix).  Go back to the description in how parse_treeish_arg()
works in the message you are responding to (or to archive.c).  It
says "if we have prefix, grab the tree object that corresponds to
the prefix in the given tree-ish".  When "subdir" is your current
directory, if you give HEAD as the tree-ish, this logic will give
you the subtree HEAD:subdir; if you give anything else that does not
correspond to the top-level of the project, it will not be able to
find such a subtree (by definition, any tree-ish that does not
correspond to the top-level cannot contain "subdir" tree object that
corresponds to "subdir" directory of the top-level; it may contain a
random directory that happens to be named "subdir", but that will be
a different directory from the "subdir" you see at the top-level).

I do not think a workable compromise exists, but one that is closest
to be workable, even though I do not think it is such a great idea,
would be to apply the add "prefix" logic _only_ when the given
tree-ish is known to be at the top-level (e.g. when it was
originally given as a commit).


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]