On Fri, 2007-05-04 at 13:52 -0400, Tom Lane wrote:
> Jeff Davis <pgsql@xxxxxxxxxxx> writes:
> > I used strcmp() and strcoll() in a tight loop, and the result was
> > indistinguishable.
>
> That's not particularly credible ... were you testing this in a
> standalone test program?  If so, did you remember to do setlocale()
> first?  Without that, you'll be in C locale regardless of environment
> contents.

I have attached a revised cmp.c that includes some extra checks. It
looks like the locale is being set correctly and still I don't see a
difference.

------------------------------------------------------------------------
$ gcc --version
gcc (GCC) 3.4.5 20051201 (Red Hat 3.4.5-2)
$ uname -a
_____________ 2.6.9-34.ELsmp #1 SMP Wed Mar 8 00:27:03 CST 2006 i686 i686 i386 GNU/Linux
$ ./cmp
locale set to: en_US.UTF-8
strcmp time elapsed: 2034183 us
strcoll time elapsed: 2019880 us
------------------------------------------------------------------------

If I had to guess, I'd say maybe strcoll() optimizes the simple cases
somehow.

[ checks FreeBSD ... ]

On FreeBSD, it's a different story. strcoll() really hurts there
(painfully so). I'm glad you pointed that out, because I have my
production boxes on FreeBSD.

Regards,
	Jeff Davis

/* cmp.c: time strcmp() against strcoll() after setting a named locale */
#include <stdio.h>
#include <stdlib.h>    /* exit() */
#include <string.h>
#include <locale.h>
#include <sys/time.h>

#define ITERATIONS 1000000000
#define THE_LOCALE "en_US.UTF-8"

int main(int argc, char *argv[]) {
  int i;
  /* The buffers and lengths below are not used by this version of the test. */
  char buff11[256];
  char buff12[256];
  char *buff21;
  char *buff22;
  char *str1 = "abcdefghijklmnop1";
  char *str2 = "abcdefghijklmnop2";
  char *newlocale;
  struct timeval t1,t2,t3;
  int elapsed_strcmp,elapsed_strcoll;
  int len1 = strlen(str1);
  int len2 = strlen(str2);

  /* setlocale() returns NULL if the requested locale is unavailable */
  if( (newlocale = setlocale(LC_ALL,THE_LOCALE)) == NULL ) {
    printf("error setting locale!\n");
    exit(1);
  } else {
    printf("locale set to: %s\n",newlocale);
  }

  gettimeofday(&t1,NULL);
  for(i=0; i < ITERATIONS; i++) {
    strcmp(str1,str2);
  }
  gettimeofday(&t2,NULL);
  for(i=0; i < ITERATIONS; i++) {
    strcoll(str1,str2);
  }
  gettimeofday(&t3,NULL);

  /* Subtract the seconds before scaling so the product cannot overflow
     a 32-bit time value. */
  elapsed_strcmp = (t2.tv_sec - t1.tv_sec) * 1000000 +
                   (t2.tv_usec - t1.tv_usec);
  elapsed_strcoll = (t3.tv_sec - t2.tv_sec) * 1000000 +
                    (t3.tv_usec - t2.tv_usec);

  printf("strcmp time elapsed: %d us\n",elapsed_strcmp);
  printf("strcoll time elapsed: %d us\n",elapsed_strcoll);
  return 0;
}
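
The loops in cmp.c discard the return value of each comparison. One thing that may be worth ruling out (this is an assumption, not something established in this thread) is that the compiler inlines or drops calls whose results are never used, which would make both loops look alike. Below is a minimal sketch of a timing helper that stores every result in a volatile variable and calls the comparison through a function pointer. The names `cmp_fn`, `sink`, `elapsed_us` and `time_compare` are hypothetical and not part of the attached cmp.c.

```c
#include <stdio.h>
#include <string.h>
#include <locale.h>
#include <sys/time.h>

/* Function-pointer type matching both strcmp() and strcoll(). */
typedef int (*cmp_fn)(const char *, const char *);

/* Written on every iteration so each call's result is observably used. */
volatile int sink;

/* Microseconds between two gettimeofday() samples. The seconds are
 * subtracted before scaling to keep the intermediate values small. */
long elapsed_us(const struct timeval *start, const struct timeval *end)
{
    return (long)(end->tv_sec - start->tv_sec) * 1000000L
         + (long)(end->tv_usec - start->tv_usec);
}

/* Call cmp(a, b) `iterations` times and return the elapsed microseconds. */
long time_compare(cmp_fn cmp, const char *a, const char *b, long iterations)
{
    struct timeval start, end;
    long i;

    gettimeofday(&start, NULL);
    for (i = 0; i < iterations; i++)
        sink = cmp(a, b);
    gettimeofday(&end, NULL);

    return elapsed_us(&start, &end);
}
```

To use it, call setlocale(LC_ALL, "en_US.UTF-8") first, check that it did not return NULL, and then compare time_compare(strcmp, str1, str2, n) with time_compare(strcoll, str1, str2, n) for the same strings and iteration count. If the two figures still match on Linux, this explanation does not apply to that setup.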