Bruce Momjian wrote:
> Greg Smith wrote:
>> Bruce Momjian wrote:
>>> I thought our only problem was testing the I/O subsystem --- I never
>>> suspected the file system might lie too. That email indicates that a
>>> large percentage of our install base is running on unreliable file
>>> systems --- why have I not heard about this before?
>>>
>> The reason why it
>> doesn't bite more people is that most Linux systems don't turn on write
>> barrier support by default, and there's a number of situations that can
>> disable barriers even if you did try to enable them. It's still pretty
>> unusual to have a working system with barriers turned on nowadays; I
>> really doubt it's "a large percentage of our install base".
>
> Ah, so it is only when write barriers are enabled, and they are not
> enabled by default --- OK, that makes sense.

The test program I linked up-thread shows that fsync does nothing
unless the inode's touched on an out-of-the-box Ubuntu 9.10 using
ext3 on a straight from Dell system.

Surely that's a common config, no?

If I uncomment the fchmod lines below I can see that even with ext3
and write caches enabled on my drives it does indeed wait.
Note that EXT4 doesn't show the problem on the same system.

Here's a slightly modified test program that's a bit easier to run.
If you run the program and it exits right away, your system isn't
waiting for platters to spin.

////////////////////////////////////////////////////////////////////
/*
** based on http://article.gmane.org/gmane.linux.file-systems/21373
** http://thread.gmane.org/gmane.linux.kernel/646040
** If this program returns instantly, the fsync() lied.
** If it takes a second or so, fsync() probably works.
** On ext3 and drives that cache writes, you probably need
** to uncomment the fchmod's to make fsync work right.
*/
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc,char *argv[]) {
  if (argc<2) {
    printf("usage: fs <filename>\n");
    exit(1);
  }
  int fd = open (argv[1], O_RDWR | O_CREAT | O_TRUNC, 0666);
  if (fd<0) {
    /* a failed open would also exit instantly and look like a lying fsync */
    perror("open");
    exit(1);
  }
  int i;
  for (i=0;i<100;i++) {
    char byte = 0;  /* initialized so we never write an indeterminate value */
    pwrite (fd, &byte, 1, 0);
    // fchmod (fd, 0644); fchmod (fd, 0664);
    fsync (fd);
  }
  close (fd);
  return 0;
}
////////////////////////////////////////////////////////////////////
ron@ron-desktop:/tmp$ /usr/bin/time ./a.out foo
0.00user 0.00system 0:00.01elapsed 21%CPU (0avgtext+0avgdata 0maxresident)k
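
For anyone who would rather see a per-fsync number than judge by total run time, here is a minimal sketch of the same test that times the loop itself and can run it both with and without the fchmod calls. The function name time_fsyncs(), its parameters, and the FSYNC_TIMING_MAIN switch are made up for illustration and are not part of the program above; it also assumes clock_gettime() with CLOCK_MONOTONIC is available on your system.

```c
/*
** Sketch only: time the write+fsync loop directly.
** time_fsyncs() and FSYNC_TIMING_MAIN are illustrative names.
*/
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
** Write one byte and fsync() it `count` times.  If touch_inode is
** nonzero, also fchmod() twice before each fsync so the inode is
** dirtied on every pass (same trick as the commented-out lines in
** the original program).
** Returns the average seconds per iteration, or -1.0 on any error.
*/
double time_fsyncs(const char *path, int count, int touch_inode)
{
    struct timespec start, end;
    char byte = 0;
    int i, fd;

    if (count <= 0)
        return -1.0;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < count; i++) {
        if (pwrite(fd, &byte, 1, 0) != 1)
            goto fail;
        if (touch_inode &&
            (fchmod(fd, 0644) != 0 || fchmod(fd, 0664) != 0))
            goto fail;
        if (fsync(fd) != 0)
            goto fail;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    return ((double)(end.tv_sec - start.tv_sec)
            + (double)(end.tv_nsec - start.tv_nsec) / 1e9) / count;

fail:
    close(fd);
    return -1.0;
}

#ifdef FSYNC_TIMING_MAIN
/* Build with -DFSYNC_TIMING_MAIN to get a standalone program. */
int main(int argc, char *argv[])
{
    double plain, touched;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <filename>\n", argv[0]);
        return 1;
    }
    plain = time_fsyncs(argv[1], 100, 0);
    touched = time_fsyncs(argv[1], 100, 1);
    if (plain < 0.0 || touched < 0.0) {
        fprintf(stderr, "timing run failed\n");
        return 1;
    }
    printf("fsync only:     %.3f ms each\n", plain * 1000.0);
    printf("fchmod + fsync: %.3f ms each\n", touched * 1000.0);
    return 0;
}
#endif
```

Going by the rule of thumb in the comment above (about a second for 100 fsyncs), something near 10 ms per call suggests the fsync is really waiting on the disk, while a small fraction of a millisecond suggests it is not. If the two lines differ wildly on the same filesystem, that would match the ext3 behaviour described earlier.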