Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Sat, 10 Feb 2007, Nicolas Pitre wrote:
>> > >
>> > > Because git-status itself is conceptually a read-only operation, and
>> > > having it barf on a read-only file system is justifiably a bug.
>> >
>> > I do not 100% agree that it is conceptually a read-only operation.
>>
>> It is.
>
> It really isn't.
>
> It's not even a "technical issue". It's a fundamental optimization. Sure,
> you can call optimizations just "technical issues", but the fact is, it's
> one of the things that makes git so _usable_ on large archives. At some
> point, an "optimization" is no longer just about making things slightly
> faster, it's about something much bigger, and has real semantic meaning.
> ...
> THIS IS NOT "JUST A TECHNICAL ISSUE".
> ...
> And the index is what makes it so.
>
> And that's why it's important to keep the index up-to-date.

I think a one-paragraph summary of your argument is:

 - The index is a good thing -- it is what makes the difference
   between usable and unusable.

 - git-status needs to refresh the index in order to do its thing
   efficiently and usably _anyway_, so once it spends cycles to do
   so, it is senseless not to write the refreshed index out when it
   can.

I do not think anybody disputes that in a repository with 20k+
paths, it is not sensible to leave the index stat-dirty for all
paths.

But I think your example "read-tree HEAD" misses the point by
stressing the importance of the index too much. The index is
important for usability and I do not think anybody is disputing
that.

The thing is, nobody switches the index that way without running
"update-index --refresh" afterwards. Normal people would use
git-reset to switch to a different tree object, and the command
does that for you. If you are hardcore, you would know to use
"read-tree -m HEAD" at least, to avoid making paths unnecessarily
stat-dirty.
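To make the cost of a fully stat-dirty index concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not git's actual code: `Entry`, `stat_sig`, `refresh` and `demo_refresh` are hypothetical names, the cached stat data is reduced to (size, mtime), and a plain SHA-1 of the file content stands in for the blob object name. An entry with no cached stat data models what a path looks like right after the index was populated without consulting the working tree.

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (size, mtime_ns): a simplified stand-in for the stat data an index caches.
StatSig = Tuple[int, int]


@dataclass
class Entry:
    blob: str                       # content hash recorded for the path
    stat: Optional[StatSig] = None  # None models "no cached stat data"


def content_hash(path: str) -> str:
    # Plain SHA-1 of the bytes; a simplification, not git's blob hashing.
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def stat_sig(path: str) -> StatSig:
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def refresh(index: Dict[str, Entry], root: str) -> int:
    """Re-validate stat-dirty entries; return how many files had to be read."""
    reads = 0
    for name, entry in index.items():
        path = os.path.join(root, name)
        sig = stat_sig(path)
        if entry.stat == sig:
            continue              # stat-clean: trusted without reading the file
        reads += 1                # stat-dirty: the content must be compared
        if content_hash(path) == entry.blob:
            entry.stat = sig      # unchanged content: remember the fresh stat data
    return reads


def demo_refresh(n: int = 5) -> Tuple[int, int]:
    """Build n stat-dirty entries, then refresh twice and report the reads."""
    with tempfile.TemporaryDirectory() as root:
        index: Dict[str, Entry] = {}
        for i in range(n):
            name = f"f{i}.txt"
            path = os.path.join(root, name)
            with open(path, "w") as f:
                f.write(f"content {i}\n")
            index[name] = Entry(blob=content_hash(path))  # no stat data yet
        return refresh(index, root), refresh(index, root)
```

In this model the first refresh has to read every file, and the second one reads none as long as the refreshed entries are kept. With 20k+ paths, that difference is the whole argument for writing the refreshed index out.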
Your example, while it is valid and demonstrates very well why the
index is a good thing, is simply not part of a normal workflow, and
is not very relevant when discussing the performance ramifications
of what state "git-status" should leave the index in.

When I said "calling 'update-index --refresh' in git-status loses
stat-dirtiness information", I was certainly _NOT_ talking about
losing the information that 20k+ paths used to be stat-dirty
because the user did "read-tree HEAD" earlier.

At least for me, it is very normal to do something like this:

 * start from a clean index.

 * edit cache.h, diff.h, and diff-lib.c.

 * stop, think, and realize that my earlier edit to change one
   function prototype in diff.h was not needed, and revert the
   change to that line while still in the editor.

 * fix things up further by editing other files.

And then, I would run "git diff" to see where I am. I still
remember that I touched diff.h, and I also remember that I once
changed a function prototype but then decided the change was not
necessary after all, but I do not remember if I changed anything
else in the file. It is _very_ reassuring to see the emptiness that
follows the "diff --git" header for diff.h in such a case. Seeing
the path as stat-dirty is a very good thing for me, because
otherwise I might lose a few seconds thinking that what I thought I
touched might have been cache.h and not diff.h.

To me, running "git status" is the "wrapping things up" step. I do
not need the stat-dirty assurance "git diff" gave me at that point.
Not seeing diff.h in the "modified but not updated" list is a good
thing. And in my workflow, after that "wrapping things up" step, I
do not need that stat-dirty assurance _anymore_.

I think Nico is correct to point out that the "not _anymore_" part
of the above reasoning of mine assumes _my_ workflow and
preference, and I think that is a valid point.
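The edit-then-revert sequence described above can be sketched as a toy model in Python. Everything here is an assumption made for illustration and not git's implementation: `Cached`, `snapshot`, `classify`, `write_later` and `edit_then_revert` are hypothetical names, the cached stat data is reduced to (size, mtime), a plain SHA-1 stands in for the blob name, and `frob` is a made-up prototype. The mtime is set explicitly so the outcome does not depend on timestamp granularity.

```python
import hashlib
import os
import tempfile
from typing import List, NamedTuple


class Cached(NamedTuple):
    blob: str       # content hash recorded in the toy index
    size: int
    mtime_ns: int


def snapshot(path: str) -> Cached:
    """Record content hash and stat data, as a refreshed entry would."""
    st = os.stat(path)
    with open(path, "rb") as f:
        blob = hashlib.sha1(f.read()).hexdigest()
    return Cached(blob, st.st_size, st.st_mtime_ns)


def classify(path: str, cached: Cached) -> str:
    st = os.stat(path)
    if (st.st_size, st.st_mtime_ns) == (cached.size, cached.mtime_ns):
        return "clean"                       # stat data matches: no need to read
    with open(path, "rb") as f:
        same = hashlib.sha1(f.read()).hexdigest() == cached.blob
    # Stat data differs; only the content comparison tells the two cases apart.
    return "stat-dirty" if same else "modified"


def write_later(path: str, text: str, cached: Cached, seconds: int) -> None:
    """Write text, then force the mtime to `seconds` past the cached mtime."""
    with open(path, "w") as f:
        f.write(text)
    t = cached.mtime_ns + seconds * 1_000_000_000
    os.utime(path, ns=(t, t))


def edit_then_revert() -> List[str]:
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "diff.h")
        original = "int frob(void);\n"          # hypothetical prototype
        with open(path, "w") as f:
            f.write(original)
        cached = snapshot(path)                 # start from a clean index
        states = [classify(path, cached)]
        write_later(path, "int frob(int flags);\n", cached, 10)  # the edit
        states.append(classify(path, cached))
        write_later(path, original, cached, 20)                  # the revert
        states.append(classify(path, cached))
        cached = snapshot(path)                 # models saving the refreshed entry
        states.append(classify(path, cached))
        return states
```

In this model `edit_then_revert()` goes through "clean", "modified", "stat-dirty" and back to "clean". The third state is the one that shows up as a header with nothing under it, and the last step is what throws that information away.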
Not saving the refreshed index would make the stat-dirtiness for
diff.h come back, which would be inconvenient and annoying to me.
But the user might want to keep it stat-dirty after running
"git-status". People in the "not _anymore_" camp like me can throw
the stat-dirtiness away with "update-index --refresh".

I do not think he (or anybody) is advocating keeping 20k+ paths in
a stat-dirty state (arguably, "artificially" so, due to the use of
"read-tree HEAD"), so your example using "read-tree HEAD" only
confuses the discussion.

Having said all that, I do agree with you that git-status should
throw that stat-dirtiness information away by saving the refreshed
index. Doing otherwise is annoying to me, as I already said, and I
cannot think of a valid reason for the user to want to keep
stat-dirtiness information after running "git-status", because to
me the whole point of running "git-status" is to start wrapping
things up.