On 19.07.2014 18:43, brian m. carlson wrote:
On Sat, Jul 19, 2014 at 02:11:30PM +0200, René Scharfe wrote:
I'd say if a platform doesn't bother optimizing memcmp(), it deserves
the resulting performance. And the penalty is probably not too bad,
because such comparisons are unlikely to make up a significant part of
most applications.
I tend to agree with this. On many modern versions of GCC, the compiler
can generate an appropriately optimized inline version when it sees a
memcmp call, so in that case it's more of a compiler issue, since no
actual call to the function will be emitted.
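For a constant length like ours, that builtin expansion is easy to
trigger; a minimal sketch (hashcmp20 is just an invented name here):

#include <string.h>

/* With the length known at compile time, GCC can expand this
 * memcmp as a builtin instead of emitting a library call. */
static inline int hashcmp20(const unsigned char *a,
                            const unsigned char *b)
{
        return memcmp(a, b, 20);
}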
I just found this open GCC bug entry about glibc's memcmp being faster
than the compiler's inlined version:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052.
(Found through
http://randomascii.wordpress.com/2012/10/31/comparing-memory-is-still-tricky/,
which says that the compilers shipped with Microsoft Visual Studio 2010
and 2012 likewise don't optimize memcmp() as much as they could.)
static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
{
+ const uint32_t *p1 = (const uint32_t *)sha1;
+ const uint32_t *p2 = (const uint32_t *)sha2;
You can't make this cast. The guaranteed alignment for sha1 and sha2 is
1, and for p1 and p2, it's 4. If sha1 and sha2 are not suitably
aligned, this will get a SIGBUS on sparc and possibly a wrong value on
ARM[0].
[0] http://www.aleph1.co.uk/chapter-10-arm-structured-alignment-faq
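For what it's worth, a cast-free way to get word-sized compares would
be to memcpy the bytes into aligned locals; compilers usually turn
that into plain loads where the target allows it. A sketch, not part
of the patch (hashcmp_words is an invented name):

#include <stdint.h>
#include <string.h>

static inline int hashcmp_words(const unsigned char *sha1,
                                const unsigned char *sha2)
{
        int i;

        for (i = 0; i < 20; i += 4) {
                uint32_t a, b;

                /* memcpy avoids the misaligned access entirely */
                memcpy(&a, sha1 + i, 4);
                memcpy(&b, sha2 + i, 4);
                if (a != b)
                        /* compare the differing word byte-wise to
                         * keep memcmp()'s sign semantics */
                        return memcmp(sha1 + i, sha2 + i, 4);
        }
        return 0;
}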
Yeah, it was just a trial balloon that happens to work on amd64. We
could invent a hash type with correct alignment (a struct with a
uint32_t[5] member?) and replace all those unsigned char pointers if we
wanted to go with such a "vectorized" hashcmp, but that would be
maximally invasive.
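Just to make the idea concrete, such a type might look roughly like
this (names invented for the sketch):

#include <stdint.h>

struct object_hash {
        uint32_t w[5];  /* 20 bytes, naturally 4-byte aligned */
};

static inline int hashcmp_aligned(const struct object_hash *a,
                                  const struct object_hash *b)
{
        int i;

        for (i = 0; i < 5; i++) {
                if (a->w[i] != b->w[i])
                        /* NB: word-wise order differs from memcmp()'s
                         * byte-wise order on little-endian machines */
                        return a->w[i] < b->w[i] ? -1 : 1;
        }
        return 0;
}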
René