On Thu, 17 Jul 2008, Shawn O. Pearce wrote: > Its rare that we should exceed deltaBaseCacheLimit while resolving > delta compressed objects. By default this limit is 16M, and most > chains are under 50 objects in length. This affords about 327K per > object in the chain, which is quite large by source code standards. > > If we have to recreate a prior delta base because we evicted it to > stay within the deltaBaseCacheLimit we can warn the user that their > configured limit is perhaps too low for this repository data set. > If the user keeps seeing the warning they can research it in the > documentation, and consider setting it higher on this repository, > or just globally on their system. As I said earlier, I don't think this is a good idea, but I'll elaborate a bit more. First, this is a really bad clue for setting deltaBaseCacheLimit. The likelyhood of this warning to actually show up during an initial clone is relatively high, yet this doesn't mean that deltaBaseCacheLimit has to be changed at all. For one, the real time usage of deltaBaseCacheLimit is to cap a cache of objects for multiple delta chains with random access, and not only one chain traversed linearly like in the index-pack case, and that cache is likely to always be full and in active eviction mode -- that's the point of a cap after all. In the index-pack this is only used to avoid excessive memory usage for intermediate delta results and not really a cache. In other words, we have two rather different usages for the same settings. Now don't read me wrong: I think that reusing this setting is sensible, but its value should not be determined by what index-pack may happen to do with it, especially on a first clone. And issuing warnings on the first clone is not the way to give new users confidence either. Secondly, on subsequent fetches, the warning is likely to never appear again due to the fact that the delta chains will typically be much shorter. And that would be true even if in reality the runtime access to the repository would benefit a lot from deltaBaseCacheLimit being raised. And it is the runtime access which is important here, not the occasional fetch. Yet the full delta chains are not likely to be walked in their entirety very often anyway either. Thirdly, if such indication is considered useful, then it should really be part of some statistic/analysis tool, such as verify-pack for example. Such a tool could compute the exact memory requirements for a given repository usage and possibly provide suggestions as to what the optimal deltaBaseCacheLimit value could be. But yet that cache has a hardcoded number of entries at the moment and its hash function might not be optimal either, making the connection with index-pack even more apart. And finally, I think that index-pack would benefit a lot from a really simple optimization which is to free the resulting intermediate delta base object right away when there is only one delta child to resolve, before that child is itself used as a base for further delta grand-children. That is likely to cover most cases of big delta chains already, making that warning an even worse indicator. > Suggested-by: Stephan Hennig <mailing_list@xxxxxxxx> > Signed-off-by: Shawn O. Pearce <spearce@xxxxxxxxxxx> Unrecommended-by: Nicolas Pitre <nico@xxxxxxx> Nicolas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html