Martin Koegler <martin.koegler@xxxxxxxxx> writes: > From: Martin Koegler <martin.koegler@xxxxxxxxx> Just a nitpick on the patch title. As "git shortlog --no-merges" output would tell you, we try to prefix the title with a short name of the area of codebase we are touching, followed by a colon and a space and then remainder without extra capitalization. Perhaps Subject: delta: fix enconding size larger than an "uint" can hold > The current delta code produces incorrect pack objects for files > 4GB. > > Signed-off-by: Martin Koegler <martin.koegler@xxxxxxxxx> I am a bit torn on this change. The original is indeed bad in that the code does not guarantee that an intermediate variable like 'l' is not large enough to hold the true size we know in index->src_size, and in that sense this change is an improvement. Given that this is not merely a local storage format but it also is an interchange format, we would probably want to make sure that the receiving end (e.g. get_delta_hdr_size() that is used at the beginning of patch_delta()) on a platform whose size_t is smaller than that of a platform that produced the delta stream with this code behaves "sensibly". If we replaced ulong we use in create/patch delta codepaths with uint32_t, that would be safer, just because the encoder would not be able to emit varint that is larger than the receivers to handle. But that defeats the whole point of using varint() to encode the sizes in the first place. It was partly done for space saving, but more for allowing larger sizes and larger ulong in the future without having to change the file format. Perhaps we should teach the receiving end to notice that the varint data it reads encodes a size that is too large for it to grok and die. With that, we can safely move forward with whatever size_t each platform uses. Thanks. > --- > For next. > > diff-delta.c | 24 +++++++++++++----------- > 1 file changed, 13 insertions(+), 11 deletions(-) > > diff --git a/diff-delta.c b/diff-delta.c > index 3797ce6..cd238c8 100644 > --- a/diff-delta.c > +++ b/diff-delta.c > @@ -319,7 +319,9 @@ create_delta(const struct delta_index *index, > const void *trg_buf, unsigned long trg_size, > unsigned long *delta_size, unsigned long max_size) > { > - unsigned int i, outpos, outsize, moff, msize, val; > + unsigned int i, val; > + off_t outpos, moff; > + size_t l, outsize, msize; > int inscnt; > const unsigned char *ref_data, *ref_top, *data, *top; > unsigned char *out; > @@ -336,20 +338,20 @@ create_delta(const struct delta_index *index, > return NULL; > > /* store reference buffer size */ > - i = index->src_size; > - while (i >= 0x80) { > - out[outpos++] = i | 0x80; > - i >>= 7; > + l = index->src_size; > + while (l >= 0x80) { > + out[outpos++] = l | 0x80; > + l >>= 7; > } > - out[outpos++] = i; > + out[outpos++] = l; > > /* store target buffer size */ > - i = trg_size; > - while (i >= 0x80) { > - out[outpos++] = i | 0x80; > - i >>= 7; > + l = trg_size; > + while (l >= 0x80) { > + out[outpos++] = l | 0x80; > + l >>= 7; > } > - out[outpos++] = i; > + out[outpos++] = l; > > ref_data = index->src_buf; > ref_top = ref_data + index->src_size;