On Wed, 18 Oct 2006, Linus Torvalds wrote:

> On Wed, 18 Oct 2006, Davide Libenzi wrote:
> >
> > The hash value (hence the hash bucket index) simply directs you to the
> > bucket where a real record-compare loop is performed.
>
> As far as I can tell not all loops do a real "record-compare" thing.
>
> Some of the hash loops _only_ look at the hash, and as such a bad hash
> will do more than just cause bad performance, it will actually degrade the
> diff itself. Isn't that what XDL_MAX_EQLIMIT effectively does?

XDL_MAX_EQLIMIT is used to limit the search for equal records in the
record-discard phase. Note, though, that at that point the "ha" value is a
record-class ID (every distinct record/line in the input has a unique ID).
Look at what xdl_classify_record() does. So in that case, XDL_HASHLONG can
really simply be a bitmask. Comparing "ha" in that loop therefore does the
right thing in any case (equal "ha" really means equal records).

> Btw, the binary delta generator doesn't seem to have this issue at all: it
> uses "unsigned int" for the hash values, so the xdiff delta generation
> will give the same exact results on 32-bit and 64-bit architectures.
>
> Or was that one of the changes by Nico? (I only looked at the git version
> of that code)

The binary diff in libxdiff uses a chaining hash, so even in that case it
wouldn't have made a difference. I think Nico changed the hash to be a
coalesced hash, and in that case it does change the output.

- Davide
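To illustrate the record-class point, here is a minimal sketch in C. It is not the xdiff code: the names (classify, struct rec_class, HMASK), the djb2-style hash, and the fixed-size tables are all assumptions made for the example. It only shows the general idea that the hash picks a bucket through a plain bitmask, a real content compare runs inside the bucket, and each distinct line ends up with a unique ID, so later loops can compare IDs alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HBITS 4                        /* hypothetical: 2^4 buckets */
#define HSIZE (1u << HBITS)
#define HMASK(v) ((v) & (HSIZE - 1))   /* bucket index is a plain bitmask */
#define MAX_CLASSES 64                 /* hypothetical fixed pool size */

struct rec_class {
    struct rec_class *next;  /* chain of classes that share a bucket */
    unsigned long hash;      /* full hash of the line */
    const char *line;        /* line content (not copied) */
    size_t len;
    unsigned long id;        /* unique record-class ID */
};

struct classifier {
    struct rec_class *buckets[HSIZE];
    struct rec_class pool[MAX_CLASSES];
    unsigned long count;     /* number of classes handed out so far */
};

/* djb2-style hash, chosen only for illustration. */
static unsigned long line_hash(const char *s, size_t len)
{
    unsigned long h = 5381;
    for (size_t i = 0; i < len; i++)
        h = h * 33 + (unsigned char)s[i];
    return h;
}

/*
 * Return the class ID for a line. The hash only selects the bucket;
 * equality is decided by comparing length and bytes, so two lines get
 * the same ID only if they are really equal.
 */
static unsigned long classify(struct classifier *cf, const char *line)
{
    size_t len = strlen(line);
    unsigned long h = line_hash(line, len);
    struct rec_class **head = &cf->buckets[HMASK(h)];

    for (struct rec_class *rc = *head; rc; rc = rc->next)
        if (rc->hash == h && rc->len == len &&
            memcmp(rc->line, line, len) == 0)
            return rc->id;

    assert(cf->count < MAX_CLASSES);
    struct rec_class *rc = &cf->pool[cf->count];
    rc->hash = h;
    rc->line = line;
    rc->len = len;
    rc->id = cf->count++;
    rc->next = *head;
    *head = rc;
    return rc->id;
}
```

With a zero-initialized struct classifier, classifying the same line twice yields the same ID and a different line yields a different one, whatever the quality of the hash. A poor hash only makes the bucket chains longer.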