Jeff Davis <pgsql@xxxxxxxxxxx> writes:
> $ ./cmp
> locale set to: en_US.UTF-8
> strcmp time elapsed: 2034183 us
> strcoll time elapsed: 2019880 us

It's hardly credible that you could do either strcmp or strcoll in 2 nsec
on any run-of-the-mill hardware.  What I think is happening is that the
compiler is aware that these are side-effect-free functions and is
removing the calls entirely, or at least moving them out of the loops;
these times would be credible for loops consisting only of an increment,
test, and branch.

Integer overflow in your elapsed-time calculation is probably a risk
as well --- do the reports add up to something like the actual elapsed
time?

I tried a modified form of your program (attached) on an FC6 machine
and found that at any optimization level above -O0, that compiler
optimizes the strcmp() case into oblivion, even with code added as
below to try to make it look like a real operation.  The strcoll() call
without any following test, as you had, draws a warning about
"statement with no effect" which is pretty suspicious too.  With the
modified program I get

$ gcc -O1 -Wall cmptest.c
$ time ./a.out
locale set to: en_US.UTF-8
strcmp time elapsed: 0 us
strcoll time elapsed: 67756363 us

real    1m7.758s
user    1m7.746s
sys     0m0.006s

$ gcc -O0 -Wall cmptest.c
$ time ./a.out
locale set to: en_US.UTF-8
strcmp time elapsed: 4825504 us
strcoll time elapsed: 68864890 us

real    1m13.692s
user    1m13.676s
sys     0m0.010s

So as best I can tell, strcoll() is pretty dang expensive on Linux too.

			regards, tom lane
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <locale.h>
#include <sys/time.h>

#define ITERATIONS 100000000
#define THE_LOCALE "en_US.UTF-8"

int
main(int argc, char *argv[])
{
	int		i;
	char	   *str1 = "abcdefghijklmnop1";
	char	   *str2 = "abcdefghijklmnop2";
	char	   *newlocale;
	struct timeval t1, t2, t3;
	double		elapsed_strcmp, elapsed_strcoll;

	if ((newlocale = setlocale(LC_ALL, THE_LOCALE)) == NULL)
	{
		printf("error setting locale!\n");
		exit(1);
	}
	else
	{
		printf("locale set to: %s\n", newlocale);
	}

	gettimeofday(&t1, NULL);

	for (i = 0; i < ITERATIONS; i++)
	{
		if (strcmp(str1, str2) == 0)
			printf("unexpected equality\n");
	}

	gettimeofday(&t2, NULL);

	for (i = 0; i < ITERATIONS; i++)
	{
		if (strcoll(str1, str2) == 0)
			printf("unexpected equality\n");
	}

	gettimeofday(&t3, NULL);

	elapsed_strcmp = (t2.tv_sec * 1000000.0 + t2.tv_usec) -
		(t1.tv_sec * 1000000.0 + t1.tv_usec);
	elapsed_strcoll = (t3.tv_sec * 1000000.0 + t3.tv_usec) -
		(t2.tv_sec * 1000000.0 + t2.tv_usec);

	printf("strcmp time elapsed: %.0f us\n", elapsed_strcmp);
	printf("strcoll time elapsed: %.0f us\n", elapsed_strcoll);

	return 0;
}
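
For anyone who wants both loops to survive at -O1 and above, here is a rough sketch (not part of the attached program) of one way to address the two pitfalls mentioned in the message. The names elapsed_us, time_compare, compare_fn and sink are made up for illustration. It assumes that adding each result to a volatile object, and reading the string pointers through volatile variables, is enough to stop the compiler from dropping or hoisting the calls. That should hold for gcc, but I have not checked it on every compiler or optimization level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/time.h>

/*
 * Hypothetical helper: elapsed microseconds between two gettimeofday()
 * samples.  Subtracting the seconds first and widening to 64 bits keeps
 * the multiplication by 1000000 from overflowing a 32-bit integer.
 */
static int64_t
elapsed_us(struct timeval start, struct timeval end)
{
	int64_t		sec = (int64_t) end.tv_sec - (int64_t) start.tv_sec;
	int64_t		usec = (int64_t) end.tv_usec - (int64_t) start.tv_usec;

	return sec * 1000000 + usec;
}

/*
 * Every comparison result is added to this object.  Because it is
 * volatile, each update must actually be performed, so the calls that
 * produce the values cannot simply be deleted.  Unsigned arithmetic is
 * used so that wraparound is well defined.
 */
static volatile unsigned long sink;

typedef int (*compare_fn) (const char *, const char *);

/*
 * Hypothetical helper: time "iterations" calls of cmp(a, b) and return
 * the elapsed wall-clock time in microseconds.
 */
static int64_t
time_compare(compare_fn cmp, const char *a, const char *b, long iterations)
{
	/* volatile pointers are re-read on each pass, so the call is not hoisted */
	const char *volatile va = a;
	const char *volatile vb = b;
	struct timeval t1, t2;
	long		i;

	assert(iterations > 0);

	gettimeofday(&t1, NULL);
	for (i = 0; i < iterations; i++)
		sink += (unsigned long) cmp(va, vb);
	gettimeofday(&t2, NULL);

	return elapsed_us(t1, t2);
}
```

In the attached program this would be called as time_compare(strcmp, str1, str2, ITERATIONS) and time_compare(strcoll, str1, str2, ITERATIONS) after the setlocale() call. Going through a function pointer and a volatile store adds a small fixed cost to each iteration, so the absolute numbers will be a bit higher than a bare call; the comparison between strcmp and strcoll should still be meaningful.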