Nick Townsend <nick.townsend@xxxxxxx> writes: >> In other words, are all these new tests expected to pass? > > Sorry about that - the first four tests do pass with 1.8.5, the last > three tests do not currently pass. OK. >> So I do not think it is expected to accept tree object names with >> the HEAD:<path> style syntax, if the user expects a predictable and >> consistent result. The third step above attempts to make sure that >> you name a tree-ish that corresponds to the top-level of the >> project, i.e. with no <path>. >> > That’s not quite right - paths do work from the root directory - see tests. I know that they work from the very top, but they _only_ happens to work from the very top. As I do not see the code is designed to compare the prefix (i.e. the subdirectory you are in) and the path that is tucked after the name of the top-level tree object with a colon (e.g. path in "HEAD:path") and adjust these two accordingly, I do not think it was designed to deal with that at all. The code just assumes that the tree object corresponds to the top-level, and the adjustment is done solely by adding the prefix when limiting the output by paths. This observation *is* very right. Not designing to deal with that case may or may not be right, but that is a different issue. > It was this very useful capability that I sought to add and generalized > the code to do. >> What seems to be supported are: >> >> cd a && git archive HEAD ;# archives HEAD:a tree >> cd a && git archive HEAD -- b ;# archives a/b/ part of HEAD:a as b/ >> >> Specifically, it appears that HEAD:./b, HEAD:b etc. are not designed >> to work, at least to me. >> ... >> I am not saying that these should _not_ work, but it is unclear what >> it means to "work". For example, what should this do? >> >> cd a && git archive HEAD:./b $pathspec >> > I think we can clear this up by documenting the cases and choosing > sensible behaviour _for git-archive_ ie. what people might expect. I am afraid that this topic is not limited to "git-archive"; "sensible behaviour _for git-archive_" will end up being far from sufficient. I would not be surprised if "ls-tree" shares the same issue, for example. If you had "subdir/subsubdir/path" (and other paths) tracked in your project, cd subdir git ls-tree HEAD:./subsubdir path is likely to have the same issue as cd subdir git archive HEAD:./subsubdir path I would think. > Here is a suggestion: > > a. The pathspec is used as a selector to include things in the archive. > it is applied after the cwd and treeish selection. That would mean that doing this in the top-level: git archive HEAD:subdir/subsubdir path<TAB> will not auto-complete the pathname. In all other Git commands, vanilla pathspecs (at least the ones that do not use the magic escape hatch ":/path", etc.) are relative to your current directory, and I do not think we want a single command to work in an inconsistent way with respect to that. > b. The current working directory (if present) prefixes a path in the object > if one is present. That is already done at a lower-level get_sha1_with_context(), I think. cd subdir git archive HEAD:./subsubdir will try to use tree subdir/subsubdir in the HEAD commit (but due to the prefix logic that is not preprared to use such a tree object, you will get "current working directory is untracked" error). > c. The path in the object (if present) is prefixed by the cwd (if present) and > used to select items for archiving. However the composite path so created > *is not present* in the archive - ie. the leading components are stripped. > (This is very useful behaviour for creating archives without leading paths) > > d. As currently the ―prefix option (not the prefix from setup_git_directory) > is prepended to all entries in the archive. > > These correspond to the use cases that I have for git archive - extracting all > or part of a multiple submodule tree into a tgz file, with or without leading > directories. > >> The extended SHA-1 expression HEAD:./b in the subdirectory a/ is >> interpreted by get_sha1_with_context_1() to be the name of the tree >> object at path "a/b" in the commit HEAD. Further, you are giving a >> pathspec while in a subdirectory a/ to the command. What should >> that pathspec be relative to? >> >> In a normal Git command, the pathspec always is relative to the >> current subdirectory, so, the way to learn about the tree object >> a/b/c in the HEAD while in subdirectory a/ would be: >> >> cd a && git ls-tree HEAD b/c >> >> But what should the command line for archive to grab HEAD:a/b/c be? >> It feels wrong to say: >> >> cd a && git archive HEAD:./b b/c > It’s clear to me that if you are in a subdirectory, that is an implicit prefix on the > ./b so you retrieve HEAD:a/b which then archives everything in b without the > leading a/b. The pathspec is wrong (including the b) and this archive is empty. >> >> and it also feels wrong to say >> >> cd a && git archive HEAD:./b c >> > That looks fine to me given the rules suggested above. > > Also git-parse manages > to return the correct thing in this case, so I’d expect this to work. > >> No matter what we would do, we should behave consistently with this >> case: >> >> treeish=$(git rev-parse HEAD:a/b) && >> cd a && >> git archive $treeish -- $pathspec >> > Well this will fail both in the existing case (‘fatal: current > working directory is untracked’) and with the scheme proposed > above (‘fatal: invalid object name $treeish:a/‘) The thing is the "current working directory is untracked" is a fallout from the current archive code not expecting to see a tree-ish specifed in HEAD:<path> style to interact with $cwd (aka prefix). Go back to the description in how parse_treeish_arg() works in the message you are responding to (or to archive.c). It says "if we have prefix, grab the tree object that corresponds to the prefix in the given tree-ish". When "subdir" is your current directory, if you give HEAD as the tree-ish, this logic will give you the subtree HEAD:subdir; if you give anything else that does not correspond to the top-level of the project, it will not be able to find such a subtree (by definition, any tree-ish that does not correspond to the top-level cannot contain "subdir" tree object that corresponds to "subdir" directory of the top-level; it may contain a random directory that happens to be named "subdir", but that will be a different directory from the "subdir" you see at the top-level). I do not think a workable compromise exists, but one that is closest to be workable, even though I do not think it is such a great idea, would be to apply the add "prefix" logic _only_ when the given tree-ish is known to be at the top-level (e.g. when it was originally given as a commit). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html