On 10/01/2011 10:41 PM, Junio C Hamano wrote:
> Martin Fick <mfick@xxxxxxxxxxxxxx> writes:
>> I guess this makes sense, we invalidate the cache and have
>> to rebuild it after every new ref is added? Perhaps a
>> simple fix would be to move the invalidation right after all
>> the refs are updated? Maybe write_ref_sha1 could take in a
>> flag to tell it to not invalidate the cache so that during
>> iterative updates it could be disabled and then run manually
>> after the update?
>
> It might make sense, on top of Julian's patch, to add a bit that says "the
> contents of this ref-array is current but the array is not sorted", and
> whenever somebody runs add_ref(), append it also to the ref-array (so that
> the contents do not have to be re-read from the filesystem) but flip the
> "unsorted" bit on. Then update look-up and iteration to sort the array
> when "unsorted" bit is on without re-reading the contents from the
> filesystem.

My WIP patch series does one better than this; it keeps track of what
part of the array is already sorted so that a reference can be found in
the sorted part of the array using binary search, and if it is not found
there a linear search is done through the unsorted part of the array. I
also have some code (not pushed) that adds some intelligence to make the
use case

    repeat many times:
        check if reference exists
        add reference

efficient by picking optimal intervals to re-sort the array. (This sort
can also be faster if most of the array is already sorted: sort the new
entries using qsort then merge sort them into the already-sorted part of
the list.)

But there is another reason that we cannot currently update the
reference cache on the fly rather than invalidating it after each
change: symbolic references are stored *resolved* in the reference
cache, and no record is kept of the reference that they refer to.
Therefore it is possible that the addition or modification of an
arbitrary reference can affect how a symbolic reference is resolved, but
there is not enough information in the cache to track this.

IMO the correct solution is to store symbolic references un-resolved.
Given that lookup is going to become much faster, the slowdown in
reference resolution should not be a big performance penalty, whereas
reference updating could become *much* faster.

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/
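
For illustration only (this is not code from the patch series, and it is
Python rather than Git's C): a minimal sketch of the two ideas discussed
in the message above. It keeps a sorted prefix plus an unsorted tail, does
a binary search followed by a linear scan, re-sorts by sorting only the
tail and merging, and stores symbolic refs unresolved so that they are
followed at lookup time. The names RefCache, sorted_len, resort and
resolve are hypothetical, the depth limit is arbitrary, and the sketch
assumes each ref name is added at most once.

```python
from bisect import bisect_left
from heapq import merge


class RefCache:
    """Toy ref cache: sorted prefix plus unsorted tail (hypothetical)."""

    def __init__(self, entries=()):
        # Each entry is (name, value). A value is either an object name
        # or "ref: <target>" for a symbolic ref, stored unresolved.
        self.entries = sorted(entries)
        self.sorted_len = len(self.entries)

    def add(self, name, value):
        # Append without re-sorting; entries past sorted_len are unsorted.
        self.entries.append((name, value))

    def lookup(self, name):
        # Binary search over the sorted prefix first.
        i = bisect_left(self.entries, (name,), 0, self.sorted_len)
        if i < self.sorted_len and self.entries[i][0] == name:
            return self.entries[i][1]
        # Not found there: linear scan of the unsorted tail.
        for n, v in self.entries[self.sorted_len:]:
            if n == name:
                return v
        return None

    def resort(self):
        # Sort only the new tail, then merge it into the sorted prefix.
        tail = sorted(self.entries[self.sorted_len:])
        self.entries = list(merge(self.entries[:self.sorted_len], tail))
        self.sorted_len = len(self.entries)

    def resolve(self, name, max_depth=5):
        # Follow "ref: " links at lookup time (depth limit is arbitrary).
        for _ in range(max_depth):
            value = self.lookup(name)
            if value is None or not value.startswith("ref: "):
                return value
            name = value[len("ref: "):]
        return None  # too many levels of indirection


cache = RefCache([("refs/heads/master", "a" * 40),
                  ("HEAD", "ref: refs/heads/topic")])
cache.add("refs/heads/topic", "b" * 40)   # no invalidation, no re-read
assert cache.resolve("HEAD") == "b" * 40  # symref sees the new ref
```

Because HEAD is stored as "ref: refs/heads/topic" rather than as a
resolved value, adding refs/heads/topic changes what HEAD resolves to
without the cache having to be invalidated. When to call resort() is the
"optimal intervals" question from the message; this sketch leaves that
decision to the caller.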