On Tue, 17 Oct 2006, Linus Torvalds wrote: > > > On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > On Tue, 17 Oct 2006, Sergey Vlasov wrote: > > > > > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs: > > > > Ah bummer. Then this is most likely the cause. And here's a simple > > fix (Junio please confirm): > > Why do you use "unsigned long" in the first place? Because offsets into packs are expressed as unsigned long everywhere else (except in the current pack index on-disk format). > For some structure like this, it sounds positively wrong. Pack-files > should be architecture-neutral, which means that they shouldn't depend on > word-size, and they should be in some neutral byte-order. But they do. Please consider this code: case OBJ_OFS_DELTA: memset(delta_base, 0, sizeof(*delta_base)); c = pack_base[pos++]; base_offset = c & 127; while (c & 128) { base_offset += 1; if (!base_offset || base_offset & ~(~0UL >> 7)) bad_object(offset, "offset value overflow for delta base object"); if (pos >= pack_limit) bad_object(offset, "object extends past end of pack"); c = pack_base[pos++]; base_offset = (base_offset << 7) + (c & 127); } delta_base->offset = offset - base_offset; if (delta_base->offset >= offset) bad_object(offset, "delta base offset is out of bound"); break; Do you see anything inerently wrong in this code? The above is already 64-bit ready such that it'll just work on 64-bit archs and will display a sensible message if a 32-bit arch encounter a pack larger than 4GB. But the on-disk pack format has no limitation what so ever. > Quite frankly, this all makes me go "Eww..". The original pack-file (well, > v2) format was well-defined and had none of these issues. In contrast, the > new code in 'next' is just _ugly_. I beg to differ. Please reconsider in light of the above. > And maybe it's just me, but I consider unions to be bug-prone on their > own. The "master" branch has exactly two unions: the "grep_expr" structure > contains one (where the union member is clearly defined by the node type > in that structure), and object.c has a "union any_object" that _literally_ > exists as purely an allocation size issue (ie it is used _only_ to > allocate the maximum size of any of the possible structures). > > In contrast, the new union introduced in "next" is just horrid. There's > not even any way to know which member to use, except apparently that it > expects that a SHA1 is never zero in the last 12 bytes. Which is probably > true, but still - that's some ugly stuff. This union should be looked at just like a sortable hash pointing to a base object so that deltas with the same base object can be sorted together. And the field to use is well defined of course: deltas with sha1 to base use the sha1 member, deltas with offset to base use the offset member. This hash, together with the delta type, constitute a tuple guaranteed to be unique so there can't be any confusion. > Is this something you want to bet a big project on? I don't see why not. Nicolas - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html