Jeff King <peff@xxxxxxxx> writes:

> Related to this, I have wondered if it might be useful to have an "index
> reflog". If I do something like this:
>
>   $ git add foo
>   $ hack hack hack
>   $ git add foo
>
> Then the first added state of "foo" is available in the object database,
> but it is not connected to the name "foo" in any way, which makes it
> much harder to find. If we had a reflog pointing to trees representing
> the index state after each change, then it would be simple (you could
> look at "INDEX@{1}:foo" or similar).
>
> I don't know if the performance is an issue. We are writing an extra
> tree every time we touch the index, but in many cases you are already
> writing a blob.

It is not just "an extra tree every time".

For example, in the kernel repository, one of the deepest paths [*1*]
(i.e. one whose modification affects the largest number of trees) is:

    arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm/iop_reg_space_asm.h

If you modify this file and then "git add", and if you write-tree the
index at that point, you need to write a tree object for ".", arch/,
arch/cris, ..., arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm,
10 trees in total (if I am counting them right ;-).

If your cache-tree is fresh (and if you "git write-tree" every time you
"git add", that will make it stay fresh), you do not have to recompute
the object names of the other 1728 tree objects (they are unchanged)
[*2*], which should help somewhat, but the majority of the time is spent
in the I/O (and perhaps slow fsync on ext3 ;-) of writing these 10 tree
objects [*3*].

People like Shawn who work with Java projects, where the tree hierarchy
tends to be (unnecessarily) deep with prefixes like org/spearce/jgit due
to the namespace issues, will have a bigger overhead than they would in
a relatively shallow project like git.git itself.

[Footnotes]

*1* You can find it out yourself with...

    git ls-files "$(
        git ls-files |
        sed -e 's|[^/]||g' |
        sort -u |
        tail -n 1 |
        sed -e 's|/|*/|g' -e 's/$/*/'
    )" | head -n 1

*2* The total number of tree objects in a commit is...

    echo $(git ls-tree -r -d HEAD | wc -l) 1 + p | dc

*3* write-tree with or without help from cache-tree in the kernel
repository with a hot cache (we are talking about running "git
write-tree" every time you do "git add" so the cold cache case does not
matter) looks like this:

    $ l=arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm/iop_reg_space_asm.h
    $ echo >>$l && git add $l
    $ /usr/bin/time git write-tree
    04bc92c40a5d0f0d44e162e140cb00964a52046b
    0.02user 0.01system 0:00.03elapsed 102%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6387minor)pagefaults 0swaps

    $ git reset --hard
    $ echo >>$l && git add $l
    $ /usr/bin/time git write-tree --ignore-cache-tree
    04bc92c40a5d0f0d44e162e140cb00964a52046b
    0.13user 0.04system 0:00.17elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+5336outputs (0major+17141minor)pagefaults 0swaps

(The numbers are from my Athlon(tm) 64 X2 3800+ with slow IDE disks).
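
As a rough cross-check of the "10 trees in total" count for the deep
kernel path discussed in this message, here is a minimal sketch. It only
does arithmetic on the path string: one tree per directory level, plus
one for the top-level tree. The helper name trees_touched is made up for
illustration and is not a git command.

```shell
# Hypothetical helper (not part of git): estimate how many tree objects
# have to be rewritten when a single path changes, by counting the
# slashes in the path (one tree per directory level) and adding one for
# the top-level tree.
trees_touched () {
	slashes=$(printf '%s' "$1" | tr -cd '/' | wc -c)
	echo $((slashes + 1))
}

trees_touched arch/cris/include/arch-v32/mach-a3/mach/hwregs/iop/asm/iop_reg_space_asm.h
```

For that path it prints 10. A file directly under a Java-style prefix
such as org/spearce/jgit would give 4 by the same arithmetic, and a file
at the top level gives 1.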