On Mon, 22 June 2009 20:31:10 +0100, Chris Simmonds wrote:
>
> I disagree: that adds an unnecessary overhead for those architectures
> where the cpu byte order does not match the data structure ordering. I
> think the data structures should be native endian and when mkpramfs is
> written it can take a flag (e.g. -r) in the same way mkcramfs does.

Just to quantify this point, I've written a small crap program:

#include <stdio.h>
#include <stdint.h>
#include <byteswap.h>
#include <sys/time.h>

/* Elapsed time from t1 to t2 in microseconds. */
long long delta(struct timeval *t1, struct timeval *t2)
{
	long long delta;

	delta = 1000000ull * t2->tv_sec + t2->tv_usec;
	delta -= 1000000ull * t1->tv_sec + t1->tv_usec;
	return delta;
}

#define LOOPS 100000000

int main(void)
{
	long native = 0;
	uint32_t narrow = 0;
	uint64_t wide = 0, native_wide = 0;
	struct timeval t1, t2, t3, t4, t5;
	int i;

	/* native long increment */
	gettimeofday(&t1, NULL);
	for (i = 0; i < LOOPS; i++)
		native++;

	/* wrong-endian 32bit: swap to cpu order, increment, swap back */
	gettimeofday(&t2, NULL);
	for (i = 0; i < LOOPS; i++)
		narrow = bswap_32(bswap_32(narrow) + 1);

	/* native 64bit increment */
	gettimeofday(&t3, NULL);
	for (i = 0; i < LOOPS; i++)
		native_wide++;

	/* wrong-endian 64bit: swap to cpu order, increment, swap back */
	gettimeofday(&t4, NULL);
	for (i = 0; i < LOOPS; i++)
		wide = bswap_64(bswap_64(wide) + 1);
	gettimeofday(&t5, NULL);

	printf("long: %9lld us\n", delta(&t1, &t2));
	printf("we32: %9lld us\n", delta(&t2, &t3));
	printf("u64: %9lld us\n", delta(&t3, &t4));
	printf("we64: %9lld us\n", delta(&t4, &t5));
	printf("loops: %9d\n", LOOPS);
	return 0;
}

Four loops doing the same increment with different data types: long,
u64, we32 (wrong-endian) and we64.  Compile with _no_ optimizations.

Results on my i386 notebook:
long:     453953 us
we32:     880273 us
u64:      504214 us
we64:    2259953 us
loops: 100000000

Or thereabouts, not completely stable.  Increasing the data width is
10% slower, 32bit endianness conversion is 2x slower, 64bit conversion
is 5x slower.

However, even the we64 loop still munches through 353MB/s (100M
conversions in 2.2s, 8 bytes per conversion.  Double the number if you
count both conversion to/from wrong endianness).
Elsewhere in this thread someone claimed the filesystem peaks out at
13MB/s.  One might further note that only filesystem metadata has to go
through endianness conversion, so on this particular machine it is
completely lost in the noise.

Feel free to run the program on any machine you care about.  If you get
numbers to back up your position, I'm willing to be convinced.  Until
then, I consider the alleged overhead of endianness conversion a prime
example of premature optimization.

Jörn

-- 
Joern's library part 7:
http://www.usenix.org/publications/library/proceedings/neworl/full_papers/mckusick.a