On Fri, May 01, 2009 at 11:19:04AM +0200, Kjetil Barvik wrote:
> * Steven Noonan <steven@xxxxxxxxxxxxxx> writes:
> | On Thu, Apr 30, 2009 at 2:36 PM, Kjetil Barvik <barvik@xxxxxxxxxxxx> wrote:
> |> * "Shawn O. Pearce" <spearce@xxxxxxxxxxx> writes:
> |> |> 4) The "static inline void hashcpy(....)" in cache.h could then
> |> |>    maybe be written like this:
> |> |
> |> | Its already done as "memcpy(a, b, 20)" which most compilers will
> |> | inline and probably reduce to 5 word moves anyway.  That's why
> |> | hashcpy() itself is inline.
> |>
> |> But would the compiler be able to trust that the hashcpy() is always
> |> called with correct word alignment on variables a and b?
>
> <snipp>
>
> | Well, I just tested this with GCC myself. I used this segment of code:
> |
> | #include <memory.h>
> | void hashcpy(unsigned char *sha_dst, const unsigned char *sha_src)
> | {
> |     memcpy(sha_dst, sha_src, 20);
> | }
>
> OK, here is a smal test, which maybe shows at least one difference
> between using "unsigned char sha1[20]" and "unsigned long sha1[5]".
>
> Given the following file, memcpy_test.c:
>
>   #include <string.h>
>   extern void hashcpy_uchar(unsigned char *sha_dst, const unsigned char *sha_src);
>   void hashcpy_uchar(unsigned char *sha_dst, const unsigned char *sha_src)
>   {
>       memcpy(sha_dst, sha_src, 20);
>   }
>   extern void hashcpy_ulong(unsigned long *sha_dst, const unsigned long *sha_src);
>   void hashcpy_ulong(unsigned long *sha_dst, const unsigned long *sha_src)
>   {
>       memcpy(sha_dst, sha_src, 5);
>   }
>
> And, compiled with the following:
>
>   gcc -O2 -mtune=core2 -march=core2 -S -fomit-frame-pointer memcpy_test.c
>
> It produced the following memcpy_test.s file:
>
>           .file   "memcpy_test.c"
>           .text
>           .p2align 4,,15
>   .globl hashcpy_ulong
>           .type   hashcpy_ulong, @function
>   hashcpy_ulong:
>           movl    8(%esp), %edx
>           movl    4(%esp), %ecx
>           movl    (%edx), %eax
>           movl    %eax, (%ecx)
>           movzbl  4(%edx), %eax
>           movb    %al, 4(%ecx)
>           ret
>           .size   hashcpy_ulong, .-hashcpy_ulong
>           .p2align 4,,15
>   .globl hashcpy_uchar
>           .type   hashcpy_uchar, @function
>   hashcpy_uchar:
>           movl    8(%esp), %edx
>           movl    4(%esp), %ecx
>           movl    (%edx), %eax
>           movl    %eax, (%ecx)
>           movl    4(%edx), %eax
>           movl    %eax, 4(%ecx)
>           movl    8(%edx), %eax
>           movl    %eax, 8(%ecx)
>           movl    12(%edx), %eax
>           movl    %eax, 12(%ecx)
>           movl    16(%edx), %eax
>           movl    %eax, 16(%ecx)
>           ret
>           .size   hashcpy_uchar, .-hashcpy_uchar
>           .ident  "GCC: (Gentoo 4.3.3-r2 p1.1, pie-10.1.5) 4.3.3"
>           .section        .note.GNU-stack,"",@progbits
>
> So, the "unsigned long" type hashcpy() used 7 instructions, compared
> to 13 for the "unsigned char" type hascpy().

But your "unsigned long" version only copies 5 bytes...

Mike
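
For a like-for-like comparison, the word-array variant would have to copy
all 20 bytes. memcpy() takes a byte count, not an element count, so the
third argument needs to be 5 * sizeof(element) rather than 5. A minimal
sketch follows; uint32_t is an assumption standing in for "unsigned long"
(4 bytes on the 32-bit x86 target above, but 8 bytes on most 64-bit Unix
ABIs), and hashcpy_u32() and hashcpy_check() are hypothetical names used
only for illustration, not anything in git:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-array form, as quoted above: copies 20 bytes, no alignment assumed. */
void hashcpy_uchar(unsigned char *sha_dst, const unsigned char *sha_src)
{
	memcpy(sha_dst, sha_src, 20);
}

/*
 * Word-array form (hypothetical). uint32_t is assumed instead of
 * "unsigned long" so that five elements are exactly 20 bytes everywhere.
 * The count is given in bytes: 5 elements * sizeof(element).
 */
void hashcpy_u32(uint32_t *sha_dst, const uint32_t *sha_src)
{
	memcpy(sha_dst, sha_src, 5 * sizeof(*sha_dst));
}

/* Hypothetical self-check: returns 1 if both variants copy all 20 bytes. */
int hashcpy_check(void)
{
	unsigned char bsrc[20], bdst[20] = {0};
	uint32_t wsrc[5], wdst[5] = {0};
	int i;

	assert(sizeof(wsrc) == sizeof(bsrc));

	/* Fill the source with distinct nonzero bytes, 1..20. */
	for (i = 0; i < 20; i++)
		bsrc[i] = (unsigned char)(i + 1);
	memcpy(wsrc, bsrc, sizeof(wsrc));

	hashcpy_uchar(bdst, bsrc);
	hashcpy_u32(wdst, wsrc);

	return memcmp(bdst, bsrc, 20) == 0 && memcmp(wdst, wsrc, 20) == 0;
}
```

Whether the two variants still differ in instruction count once both copy
20 bytes would have to be re-measured with the same gcc command line; the
7-versus-13 figure above does not answer that.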