Some more experiments:

David Kastrup <dak@xxxxxxx> writes:

> Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
>
>>> > >
>>> > > I guess the second choice generally isn't an option, but dammit,
>>> > > "git-apply" really is the better program here.
>>> >
>>> > Why not? git-apply works outside of a git repo ;-)
>>>
>>> I was more thinking that people are not necessarily willing to install git
>>> just to get the "git-apply" program..
>>
>> But maybe they would be willing to install git to get that wonderful
>> git-apply program, and that wonderful rename-and-mode-aware
>> git-diff, and the git-merge-file program, all of which can operate
>> outside of a git repository. (Take that, hg!)
>
> Well, hmph! I just rewrote my git-diff-using script to not check
> stuff into a throw-away git repository, and guess what: with real-life
> use cases (diffing trees of about 500MB size), git-diff runs out of
> memory (the machine probably has something like 1.5GB of virtual memory
> size) when operating outside of a git repository.
>
> So the usefulness still seems limited, even now that the output format
> of --name-status has been fixed.
>
> Any idea whether this is a bug, sloppy programming, or an inherent
> restriction/necessity?
>
> Also an idea which of the following scenarios would be best for
> catching all of moves/renames/deletes/adds? Note: any repository is
> strictly throw-away.
>
> Experiments are somewhat time-consuming, so every hunch helps.
>
> a) diff directories outside of git (works, but fatal memory footprint
>    for large cases)
> b) diff index against work directory      fatal memory footprint
> c) diff revision against work directory   fatal memory footprint
> d) diff revision against index            does not detect copies/renames
> e) diff revision against revision (works, but high disk footprint and
>    likely slower than alternatives)

So it seems like option e) is the only feasible option.
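For illustration, option e) can be sketched roughly as follows. This is only a sketch under assumptions, not the actual script discussed in this thread: the helper name diff_trees is hypothetical, the repository is created with mktemp and deleted afterwards, a dummy committer identity is supplied on the command line, and the two trees are assumed not to contain a .git directory of their own. It uses the modern "git <command>" spelling and points git at each tree in turn with --work-tree, so that nothing has to be copied into the repository's own directory.

```shell
#!/bin/sh
# Sketch of option e): commit two directory trees into a throw-away
# repository, then diff the two revisions with rename/copy detection.
# diff_trees is a hypothetical helper name, not part of git.
diff_trees() {
    old=$1
    new=$2
    repo=$(mktemp -d) || return 1
    git init -q "$repo" || return 1

    # Wrapper: always talk to the throw-away repository, with a dummy
    # identity so that the commits work without any user configuration.
    g() {
        git --git-dir="$repo/.git" \
            -c user.name=throwaway -c user.email=throwaway@example.invalid \
            "$@"
    }

    # First revision: the old tree. Second revision: the new tree.
    # "add -A" also stages deletions, so files missing from the new
    # tree vanish from the second revision.
    g --work-tree="$old" add -A &&
    g --work-tree="$old" commit -q --allow-empty -m old &&
    g --work-tree="$new" add -A &&
    g --work-tree="$new" commit -q --allow-empty -m new &&
    # Revision against revision, with rename (-M) and copy (-C) detection.
    g diff -M -C --name-status HEAD~1 HEAD
    status=$?

    rm -rf "$repo"
    return $status
}
```

Calling "diff_trees /path/to/old /path/to/new" prints one --name-status line per changed path. Adding --find-copies-harder to the diff would also consider unmodified files as copy sources, at extra cost.
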
In the overall numbers, git-add is by far the slowest operation,
followed by git-commit. git-diff on revisions is quite fast and has a
moderate memory footprint. Committing itself does not seem to add much
disk space: adding into the index seems to be where most of the disk
space gets allocated.

So while the behavior of d) appears puzzling, doing another commit
before the diff is cheap, so I have little motivation to ask people to
track down the problems with d).

It is somewhat unsatisfactory that rewriting my script to use the
repository-less variant of git-diff fails for seriously large use
cases due to out-of-memory conditions. I suppose that's life.

-- 
David Kastrup