On Tue, 17 Oct 2006, Linus Torvalds wrote: > > > On Tue, 17 Oct 2006, Nicolas Pitre wrote: > > > > Because offsets into packs are expressed as unsigned long everywhere > > else (except in the current pack index on-disk format). > > Until your work, that "unsigned long" was totally just an internal thing > that didn't actually bleed into anything else. And would you please explain how my work changes that state of affairs? Sorry but I don't follow you here. Still _I_ wrote that code. > > > For some structure like this, it sounds positively wrong. Pack-files > > > should be architecture-neutral, which means that they shouldn't depend on > > > word-size, and they should be in some neutral byte-order. > > > > But they do. Please consider this code: > > Right. The pack-file itself. But the code that actually _generates_ it > mixes things in alarming ways. ??? > > > In contrast, the new union introduced in "next" is just horrid. There's > > > not even any way to know which member to use, except apparently that it > > > expects that a SHA1 is never zero in the last 12 bytes. Which is probably > > > true, but still - that's some ugly stuff. > > > > This union should be looked at just like a sortable hash pointing to a > > base object so that deltas with the same base object can be sorted > > together. > > .. and it sorts _differently_ on a big-endian vs little-endian thing, > doesn't it? Sure. But who cares? The sorting is just there to 1) perform binary searches on the list of deltas based from a given object, and 2) find a list of all deltas with the same base object. > So now the sort order depends on endianness and/or wordsize. That just > sounds really really wrong. Again, who cares? That ordering doesn't influence any data produced by the tool. It is an internal and private strategy to speed up the _local_ _searching_ process. It could be replaced by a dumb linear list walk if you wish and the end result i.e. the produced pack index would be exactly the same (with a significant slowdown notwitstanding). So let me summarize: - the union is a hash. - the hash is either an offset value or a sha1 digest. - this hash is used for fast object lookup _only_. - it does sort differently on big vs little endian machines. - but we don't care at all because - it is a private algorithmic thing that doesn't "bleed" into any _real_ data structure, and - it doesn't have any influence on the format of the end result. - it is only a runtime abstraction and nothing else. - It never gets into the pack nor the pack index themselves. Do you still have issues with that? Nicolas - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html