On 04/28/2011 10:18 AM, Bernhard R. Link wrote: > * Ralf Baechle<ralf@xxxxxxxxxxxxxx> [110428 02:35]: >> On Wed, Apr 27, 2011 at 04:32:12PM -0700, Junio C Hamano wrote: >> >>>> +static inline int is_null_sha1(const unsigned char *sha1) >>>> { >>>> - return memcmp(sha1, sha2, 20); >>>> + const unsigned long long *sha1_64 = (void *)sha1; >>>> + const unsigned int *sha1_32 = (void *)sha1; >>> >>> Can everybody do unaligned accesses just fine? >> >> Misaligned accesses cause exceptions on some architectures which then >> are fixed up in software making these accesses _very_ slow. You can >> use __attribute__((packed)) to work around that but that will on the >> affected architectures make gcc generate code pessimistically that is >> slower than not using __attribute__((packed)) in case of proper >> alignment. And __attribute__((packed)) only works with GCC. > > Even __attribute__((packed)) usually does not allow arbitrary aligned > data, but can intruct the code to generate code to access code > misaligned in a special way. (I have already seen code where thus > accessing a properly aligned long caused a SIGBUS, because it was > aligned because being in a misaligned packed struct). > > In short: misaligning stuff works on x86, everywhere else it is disaster > waiting to happen. (And people claiming compiler bugs or broken > architectures, just because they do not know the basic rules of C). > Given that the vast majority of user systems are x86 style ones, it's probably worth using this patch on such systems and stick to a partially unrolled byte-by-byte comparison that finishes early on the rest of them. Properly pipelined, it will just mean that the early return undoes the fetch steps for the 3-4 unrolled bytes that it computes in advance, so if the diff comes in the first 10-12 bytes, it will still be a win. For bonus points, check if both bytestrings are equally (un)aligned first and, if they are, half-Duff it out with a fallthrough switch statement (without the while() loop) to compare byte-by-byte first and then word-for-word on the rest of it. The setup and complexity is probably not worth it for our meager 20-byte strings though. -- Andreas Ericsson andreas.ericsson@xxxxxx OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html