On Thu, Aug 10, 2017 at 01:49:24PM -0700, Junio C Hamano wrote: > The lower 4-byte of moff (before incrementing it with msize) were > already encoded to the output stream before this hunk. Shouldn't > we be checking if moff would fit in uint32_t _before_ that happens? moff is otherwise only decremented or assigned with an offset generated by create_delta_index. These offsets are limited by 4GB. Any larger offets would be a programming bug - so qualify for just a "assert". > IOW, in a much earlier part of that "while (data < top)" loop, there > is a code that tries to find a match that gives us a large msize by > iterating over index->hash[], and it sets msize and moff as a potential > location that we may want to use. If moff is already big there, then > we shouldn't consider it a match because we cannot encode its location > using 4-byte anyway, no? Recovering from an incorrect moff would add more complex code for a case, that should not happen. A bailout would be sufficient. > Cutting it off at here by resetting msize to 0 might help the next > iteration (I didn't check, but is the effect of it is to corrupt the > "val" rolling checksum and make it unlikely that the hash > computation would not find a correct match?) but it somehow feels > like closing the barn door after the horse has already bolted... The current code produces incorrect deltas - its not just a checksum issue. By the way: Somebody interested in JGIT should also look at these two bugs: https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/DeltaEncoder.java copy would also encode beyond 4GB - producing truncated delta offset. https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/BinaryDelta.java apply uses int for decoding length values. Regards, Martin