Jeff King <peff@xxxxxxxx> writes: > From a cursory look, these definitely go in the right direction. I like > how the third one was able to rip populate_from_stdin entirely out of > diff.c. > > I suspect we could get by without even the nongit_stdin flag you added; > the only place which uses it is diff_fill_sha1_info, but theoretically > we don't even need it there; we could just index_mem the file contents > we get via diff_populate_filespec, and the stdin contents will > already be there. Probably that is a sane thing to do. And as you hinted, we definitely need to give the "the copy in the filespec->data is the only one; do not discard until we are really done with it" semantics to the bit, or introduce a separate bit that is not necessarily related to the "this came from the standard input" bit. I offhand do not know what data source we would later add that needs such a treatment. "git diff http://example.com/{foo,bar}", perhaps? The "do not discard" bit at that point may need to learn to swap it out to a temporary file or something when it happens. > Right now we call index_path for worktree files, but I don't really see > much point; we have to read the whole data either way, and > populate_filespec should be mmap-ing them for us. > >> - We say on the "diff --git" header uglyness like "a/-", "b/-"; >> likewise in the metainfo; > > I'd consider changing the path to "/dev/stdin" for this case. It doesn't > exist on some platforms, of course, but neither does /dev/null, which we > use similarly. Sensible. >> - We show on the "index" header "0*" value for these entries, even >> though we should be able to compute it (after all we do so for >> files on disk in a non-git directory); > > The index_mem I mentioned above would fix that. Yes. >> - We still apply attributes and textconv as if we are dealing with >> a regular file "-" at the root level. > > I think that's bad. I wonder if it should have "*" attributes applied to > it or not. While I can see it being convenient in some cases, I think it > makes the rules confusingly complex. It is likely that the crlf conversion on Dos systems wants to be applied regardless. This is unrelated to the "standard input confusion" issue, but there are two more things coming either from the way the no-index code is done or from the way it is bolted onto git. With the current code, this: mkdir foo bar echo hello >foo/1 echo bye >bar/2 git diff --no-index foo bar shows differences between a/foo/1 and b/bar/1, as the no-index code records foo/1 and bar/1 as the paths in the filespec for them. But if you are comparing two directory hierarchies, it is a lot more likely that you would want to see corresponding files in these two directories. In other words, the patch is better shown as differences between a/1 and b/1 (the code makes the core compare foo/1 and bar/2 after all). This of course requires no-index to differentiate the logical pathname (i.e. "this is the path inside collection a/ (or b/)") and the physical location from which the contents are read. Such a differentiation would allow us to also do renames and rename classifications much more sanely. We had to add DIFF_PAIR_RENAME() and filepair->renamed_pair only to support this codepath because of this misdesign. We could have just run strcmp() between the logical pathname of one/two members of the filepair to see if the pair was renamed if it was done that way. And the way diff-no-index.c::queue_diff() walks two directories to grab paths from them in parallel and incrementally means that the filesystem walking code cannot be reused for something like this: git diff master:Documentation /var/tmp/docs to compare a hierarchy represented with a tree object with another hierarchy stored in the filesystem outside git's control. A natural way to do so would be to grab a set of paths from /var/tmp/docs and have that set compared against the other set that comes from the tree, and the "grab a set of paths from /var/tmp/docs" machinery can be used twice to implement the current git diff --no-index /var/tmp/foo /var/tmp/bar which would have been a lot cleaner implementation. Oh well. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html