On Mon, 10 Dec 2007, David Kastrup wrote: > Nicolas Pitre <nico@xxxxxxx> writes: > > > On Mon, 10 Dec 2007, Jon Smirl wrote: > > > >> I added some debug printfs which show that I > >> have a 100,000+ run of identical hash entries. Processing the 100,000 > >> entries also causes RAM consumption to explode. > > > > That is impossible. If you look at the code where those hash entries > > are created in create_delta_index(), you'll notice a hard limit of > > HASH_LIMIT (currently 64) is imposed on the number of identical hash > > entries. > > On the other hand, if we have, say, laaaaaarge streaks of zeros, what > happens is that we have 64 hashes seeing them. Now about 4096 bytes are > compared, and then the comparison stops. No. What would happen in that case is that the first hash entry pointing somewhere in the beginning of the zero chunk will match _at least_ 4096 bytes, probably much more, up to the end of the buffer if that is possible. > Then it scans backwards to > seek for more zeros (and finds about 64k of them before it stops) The first hash entry for those zeroes is going to point at an offset not greater than 15 bytes from the start of the zero area. So, unless both buffers are going to match even before that zero area, the backward scan will stop quickly. > folds them into the current compacted form. Each of these backward > scans (of which we have 64 in the worst case) is in a different memory > area. So since we scan/compare areas of 64k for each advance of 4k, we > have an overscanning factor of 16 (for a worst case scenario). Actually, if we matched 100MB of zeroes, we'll just encode that in a suite of 64KB copy operations, and continue scanning the buffer after that 100MB. So I don't see where your scan/compare areas of 64k for each advance of 4k comes from. > Not sure whether this is what we are seeing here. It would still not > explain exploding memory usage I think. Right. Nicolas - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html