Re: What's the difference between `git show branch:file | diff -u - file` vs `git diff branch file`?

Marat Radchenko <marat@xxxxxxxxxxxxxxxx> · Thu, 25 Aug 2011 20:09:03 +0400

On 08/24/2011 00:07:43 MSD, Michael J Gruber <git@xxxxxxxxxxxxxxxxxxxx> wrote:

> Junio C Hamano venit, vidit, dixit 23.08.2011 19:15:
> > Michael J Gruber <git@xxxxxxxxxxxxxxxxxxxx> writes:
> > 
> > > Marat Radchenko venit, vidit, dixit 23.08.2011 12:52:
> > > > > Is that a very large tree or a very slow file system?
> > > > Tree is large (500k files), file system is irrelevant since all
> > > > time is spent on CPU.
> > > > 
> > > > > Do we enumerate all
> > > > > differing files and only then limit diff output by path??
> > > > 
> > > > Dunno, that's why I am asking why it is so slow.
> > > 
> > > Well, we have to read the full tree before diffing.
> > 
> > Not necessarily, especially when pathspec is given like the original
> > post, i.e. "git diff $tree_ish -- $path". We would need to open tree
> > objects that lead to the leaf of the $path and a blob, but other
> > objects won't be needed.
> 
> I meant: The way "git diff" is now, it does that.
> 
> > 
> > The default diff backend tries to come up with minimal changes by
> > spending extra cycles, so it is not so surprising if the file compared
> > is large-ish and/or has very many similar lines in itself (in which
> > case there are many potential matching line pairs between the preimage
> > and the postimage to be examined to produce a minimal diff).
> 
> But the file in this case is not that large, and "git diff" spends 30s!

So, is more information needed from me, or are the gprof output from the initial report and the discussion that followed enough to determine which code needs to be improved?
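
For anyone who wants to try the comparison without a large tree, here is a minimal sketch of the two invocations on a throwaway repository. The repository, the branch name `base`, and `file.txt` are all made up for illustration, and a three-line file will of course not show the slowdown; it only shows that both commands report the same change.

```shell
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/base   # name the initial branch "base"
printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.invalid commit -q -m 'initial'

# Change the working-tree copy so there is something to diff.
printf 'one\nTWO\nthree\n' > file.txt

# Variant 1: extract the blob and hand it to an external diff.
via_show=$(git show base:file.txt | diff -u - file.txt)

# Variant 2: let git compare the tree-ish with the working tree,
# limited by a pathspec.
via_diff=$(git diff base -- file.txt)

# Keep only added/removed lines so the differing headers do not matter.
changes_show=$(printf '%s\n' "$via_show" | grep '^[-+][^-+]')
changes_diff=$(printf '%s\n' "$via_diff" | grep '^[-+][^-+]')
printf '%s\n' "$changes_show"
```

On a real tree, prefixing each variant with `time` should show whether the gap I am seeing reproduces elsewhere.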
