On Sat, 25 Feb 2006, Nicolas Pitre wrote:
>
> Yes, the hash is larger.  There is a cost in memory usage but not really
> in CPU cycles.

Note that memory usage translates almost 1:1 (or worse) to CPU cycles in
almost all real-life behaviours. Only in carefully tuned benchmarks does
it not.

Increased memory usage means more paging, and worse cache behaviour. Now,
hashes aren't wonderful for caches in the first place, but imagine the
hump you pass when the data doesn't fit in a 64kB L1 any more (or a 256kB
L2). Huge.

> > You'll find a lot of that in any file: three or four bytes of similarity
> > just doesn't sound worthwhile to go digging after.
>
> Well after having experimented a lot with multiple parameters I think
> they are worth it after all.  Not only they provide for optimal deltas,
> but their hash is faster to compute than larger blocks which seems to
> counter balance for the cost of increased hash list.

Hey, numbers talk. If you've got the numbers, I'll just shut up ;)

		Linus
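
To put rough numbers on the cache argument above, here is a minimal
back-of-the-envelope sketch. It is not taken from the delta code under
discussion. It assumes one index entry per block, a hypothetical 16-byte
entry (pointer plus hash value on a 64-bit machine), and the 64kB L1 and
256kB L2 sizes mentioned in the message. The names index_bytes and
smallest_cache are made up for illustration.

```python
# Rough estimate only: every constant here is an assumption, not a
# measurement of any real delta implementation.
L1_BYTES = 64 * 1024    # L1 size quoted in the message
L2_BYTES = 256 * 1024   # L2 size quoted in the message
ENTRY_BYTES = 16        # assumed: pointer + hash value per index entry


def index_bytes(source_len, block_size, entry_bytes=ENTRY_BYTES):
    """Approximate index size if one entry is stored per block."""
    return (source_len // block_size) * entry_bytes


def smallest_cache(nbytes):
    """Name the smallest assumed cache level that can hold nbytes."""
    if nbytes <= L1_BYTES:
        return "L1"
    if nbytes <= L2_BYTES:
        return "L2"
    return "memory"


if __name__ == "__main__":
    source_len = 64 * 1024  # hypothetical 64kB source buffer
    for block_size in (16, 4, 3):
        size = index_bytes(source_len, block_size)
        print(block_size, size, smallest_cache(size))
```

Under these assumptions, going from 16-byte to 4-byte blocks quadruples
the index, and 3-byte blocks push it past the assumed L2. The sketch
ignores the source buffer itself and any bucket array, so real footprints
would be larger. Whether that costs more than the cheaper hash saves is
exactly the question that measurements would have to settle.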