On Wed, Jul 08, 2020 at 02:06:31PM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
> > It's probably possible to teach the grep code to do the same
> > check-in-the-index trick, but I'm not sure how complicated it would be.
>
> I am not sure if we should even depend on the "check the object
> database and use it instead of reading the working tree files" done
> in diff code---somehow I thought we did the opposite for performance
> (i.e. when we ought to be comparing two objects, taken from tree and
> the index, if we notice that the index side is stat clean, we can
> read/mmap the working tree file instead of going to the object layer
> and deflating a loose object, or, worse yet, construct the blob by
> repeatedly applying deltas on a base object in a packfile).
>
> Is this one in the opposite direction done specifically for gaining
> performance when textconv cache is in use? If so, kudos to whoever
> did it---that sounds like a clever thing to do.

No, it turns out that nobody was that clever (and I was simply
misremembering how it worked).

For a tree-to-tree or index-to-tree comparison, both sides will have an
oid and can use the textconv cache. Even for an index case where we
might choose to use a stat-fresh working tree file as an optimization,
we'll still consult the textconv cache before loading those contents.

But for diffing a file in the working tree, we'll never have an oid and
will always run the textconv command. So "git diff" against the index,
for example, would run _one_ textconv (using the cached value for the
index, and running one for the working tree version).

And we know that the working tree side isn't that interesting for
optimizing, since by definition the file is stat-dirty in that case (or
else we'd skip the content-level comparison entirely). So you'd have to
compute the sha1 of the working tree file from scratch.
Plus the lifetime of a working tree file's entry in the textconv cache
is probably shorter, since it hasn't even been committed yet.

I don't think I ever noticed because the primary thing I was trying to
speed up with the textconv cache is "git log -p", etc, which always has
an oid to work with. But "grep" is a totally different story. It is
frequently looking at all of the stat-fresh working tree files.

-Peff
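
For illustration only, the behavior described above can be modeled in a
few lines of Python. This is a toy sketch, not git's actual C code: the
TextconvCache class, its convert() method, and the runs counter are
hypothetical names made up for this example. The one real detail is
blob_oid(), which follows how git names a blob under SHA-1 (the hash of
"blob <size>\0" followed by the content). The assumption being modeled
is that a side with an oid can hit the cache, while a working tree file
without one always runs the textconv command.

```python
import hashlib


def blob_oid(content: bytes) -> str:
    # SHA-1 object id of a blob: hash of "blob <size>\0" plus the content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


class TextconvCache:
    # Hypothetical stand-in for a textconv cache keyed by blob oid.
    def __init__(self, textconv):
        self.textconv = textconv  # callable: bytes -> str
        self.cache = {}           # oid -> converted text
        self.runs = 0             # how often the textconv command ran

    def convert(self, content: bytes, oid=None) -> str:
        # With an oid, a previous result can be reused.
        if oid is not None and oid in self.cache:
            return self.cache[oid]
        # Without an oid (working tree file), always run the command.
        self.runs += 1
        result = self.textconv(content)
        if oid is not None:
            self.cache[oid] = result
        return result


def diff_worktree_against_index(cache, index_blob, worktree_file):
    # Index side has an oid; working tree side does not.
    a = cache.convert(index_blob, oid=blob_oid(index_blob))
    b = cache.convert(worktree_file)
    return a, b
```

In this model, the first call to diff_worktree_against_index() runs the
converter twice (the cache is cold), and every later call with the same
index blob runs it once, for the working tree side only. A grep-like
loop over working tree files would call convert() with no oid each
time, so it would never hit the cache unless it first computed an oid
for each file.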