Junio,

Thanks a lot for your thorough explanation.

Patrick

On Tue, Apr 14, 2009 at 16:05, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Johannes Schindelin <Johannes.Schindelin@xxxxxx> writes:
>
>> On Tue, 14 Apr 2009, Patrick Berkeley wrote:
>>
>>> Does Git track the deltas on binary files?
>>>
>>> Someone in #git mentioned that if the binaries change too much Git no
>>> longer just stores the changes. If this is the case, what is the
>>> breaking point where Git goes from storing the deltas to the entire
>>> new file?
>>
>> Git does not store the deltas as you think it does. The deltification of
>> the objects is almost independent from the commit history, i.e. we
>> _always_ store snapshots for most practical matters.
>
> "Always store snapshots" sounds as if you are not storing deltas at all. I
> think I know what you meant to say, but the way you phrased it is
> misleading.
>
> Documentation/technical/pack-heuristics.txt talks about this in some
> detail. A short version is:
>
>  - It does not make a difference if you are dealing with binary or text;
>
>  - The delta is not necessarily against the same path in the previous
>    revision, so even a new file added to the history can be stored in a
>    deltified form;
>
>  - When an object stored in the deltified representation is used, it
>    incurs more cost than using the same object in the compressed base
>    representation. The deltification mechanism makes a trade-off taking
>    this cost into account, as well as the space efficiency.
>
> The last point is probably not covered by the pack-heuristics IRC talk
> with Linus that we have in the documentation. Basically:
>
>  - A deltified object is stored as a (compressed) xdelta against some
>    base object. If the best deltified representation we come up with is
>    larger than the result of just compressing the object without
>    deltification, it is not worth storing it from the space consumption
>    point of view. Thus, we originally said something like "if an
>    attempted delta is larger than half of the object size (assuming
>    an average compression ratio of 50%), do not use the deltified
>    representation, it is not worth it". We attempt to delta against many
>    base objects to pick the best possible delta; the number of attempts
>    is called the delta window.
>
>  - The base object of a deltified object could also be deltified, and you
>    may need to repeatedly apply deltas on top of some object that is not
>    a delta to get to the final object. The length of this chain is called
>    the delta depth, and obviously you would want to keep the delta depth
>    short to get reasonable runtime performance. Thus, when deltifying
>    an object A, we make a weighted comparison between the size of the
>    delta to build it out of an object of depth N and the size of the
>    delta to build it out of an object of depth M. A slightly larger delta
>    that is based on an object with a shallower delta depth is favored
>    over a smaller delta based on an object with a much deeper delta depth.
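
For illustration, the two heuristics described in the quoted explanation
(the "half of the object size" cutoff and the preference for bases with a
shallower delta depth) can be sketched roughly as below. This is a toy
model under stated assumptions, not Git's pack-objects code: the names
Candidate, pick_base and toy_delta_size, the max_depth value, and the
cost formula are all made up for this sketch, and the "delta size" is a
crude stand-in for a real xdelta.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Candidate:
    """A possible base object (hypothetical structure, not Git's)."""
    name: str
    data: bytes
    depth: int  # deltas that must be applied to rebuild this base


def toy_delta_size(base: bytes, target: bytes) -> int:
    """Crude stand-in for a delta encoder: the number of bytes of
    `target` that lie beyond its common prefix with `base`."""
    common = 0
    for a, b in zip(base, target):
        if a != b:
            break
        common += 1
    return len(target) - common


def pick_base(
    target: bytes,
    window: Sequence[Candidate],
    delta_size: Callable[[bytes, bytes], int] = toy_delta_size,
    max_depth: int = 50,  # assumed limit on the delta chain length
) -> Optional[Candidate]:
    """Return the preferred base for `target`, or None if the object
    should be stored without deltification."""
    best: Optional[Candidate] = None
    best_cost: Optional[float] = None
    for cand in window:  # the "delta window": bases we try
        if cand.depth >= max_depth:
            continue  # chain would become too long
        size = delta_size(cand.data, target)
        # Size cutoff: a delta larger than half the object is not
        # worth keeping (assumes roughly 50% compression).
        if size > len(target) // 2:
            continue
        # Depth weighting (assumed formula): the deeper the base,
        # the more its delta size is penalised.
        cost = size * max_depth / (max_depth - cand.depth)
        if best_cost is None or cost < best_cost:
            best, best_cost = cand, cost
    return best


if __name__ == "__main__":
    target = b"a" * 100
    window = [
        Candidate("shallow", b"a" * 80 + b"b" * 20, depth=1),
        Candidate("deep", b"a" * 85 + b"b" * 15, depth=40),
        Candidate("unrelated", b"z" * 100, depth=0),
    ]
    chosen = pick_base(target, window)
    print(chosen.name if chosen else "store undeltified")
```

In this toy run the "deep" base yields the smaller delta (15 bytes
against 20), but the depth penalty makes the "shallow" base the
preferred one, and the "unrelated" base is rejected by the size cutoff.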