
Re: Maximum transaction rate


Ron Mayer wrote:
> Greg Smith wrote:
>> There are some known limitations to Linux fsync that I remain somewhat
>> concerned about, independently of LVM, like "ext3 fsync() only does a
>> journal commit when the inode has changed" (see
>> http://kerneltrap.org/mailarchive/linux-kernel/2008/2/26/990504 ).  The
>> way files are preallocated, the PostgreSQL WAL is supposed to function
>> just fine even if you're using fdatasync after WAL writes, which also
>> wouldn't touch the journal (last time I checked fdatasync was
>> implemented as a full fsync on Linux).  Since the new ext4 is more
> 
> Indeed it does.
> 
> I wonder if there should be an optional fsync mode
> in postgres that turns fsync() into
>     fchmod (fd, 0644); fchmod (fd, 0664);
> to work around this issue.

The question is: why do you care if the journal is not flushed on fsync()?
If the inode is unchanged, only the file's data blocks need to be flushed.
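
To make that concrete, here is a minimal sketch of the preallocate-then-overwrite pattern Greg describes. It is not PostgreSQL's WAL code; the function names, the 4096-byte chunk size and the file mode are made up for illustration. The file is written out once at full size and fsync()ed, so later in-place overwrites change neither its size nor its block allocation, and fdatasync() has only data blocks to flush.

```c
#define _POSIX_C_SOURCE 200809L  /* for pwrite() and fdatasync() in strict modes */
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` filled with `size` zero bytes and flush
 * data and metadata once. Returns an open fd, or -1 on error. */
int preallocate_file(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    char zeros[4096];
    memset(zeros, 0, sizeof zeros);

    size_t done = 0;
    while (done < size) {
        size_t chunk = size - done < sizeof zeros ? size - done : sizeof zeros;
        ssize_t n = write(fd, zeros, chunk);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* One full fsync() so the final size and allocation are on disk. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: overwrite `len` bytes at `offset` inside the
 * preallocated region, then flush the data. Returns 0 on success, -1 on error. */
int overwrite_and_sync(int fd, const void *buf, size_t len, off_t offset)
{
    if (pwrite(fd, buf, len, offset) != (ssize_t)len)
        return -1;
    return fdatasync(fd);
}
```

The file size is the same before and after overwrite_and_sync(), which is the property the WAL relies on here. Whether the kernel and drive really push those data blocks to the platter is a separate matter.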

> For example this program below will show one write
> per disk revolution if you leave the fchmod() in there,
> and run many times faster (i.e. lying) if you remove it.
> This is with ext3 on a standard IDE drive with the write
> cache enabled, and no LVM or anything between them.
> 
> ==========================================================
> /*
> ** based on http://article.gmane.org/gmane.linux.file-systems/21373
> ** http://thread.gmane.org/gmane.linux.kernel/646040
> */
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <stdlib.h>
> 
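> int main(int argc, char *argv[]) {
>   if (argc < 2) {
>     printf("usage: fs <filename>\n");
>     exit(1);
>   }
>   int fd = open(argv[1], O_RDWR | O_CREAT | O_TRUNC, 0666);
>   if (fd < 0) {
>     perror("open");
>     exit(1);
>   }
>   int i;
>   for (i = 0; i < 100; i++) {
>     char byte = 0;                        /* value is irrelevant */
>     pwrite(fd, &byte, 1, 0);              /* rewrite the same byte at offset 0 */
>     fchmod(fd, 0644); fchmod(fd, 0664);   /* dirty the inode before syncing */
>     fsync(fd);
>   }
>   close(fd);
>   return 0;
> }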
> ==========================================================
> 

I ran the program above, w/o the fchmod()s.

$ time ./test2 testfile

real    0m0.056s
user    0m0.001s
sys     0m0.008s

This is with ext3+LVM+raid1+sata disks with hdparm -W1 (drive write
cache enabled). With -W0 (write cache disabled) I get:

$ time ./test2 testfile

real    0m1.014s
user    0m0.000s
sys     0m0.008s

Big difference. The fsync() there does its job.
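
As a rough sanity check on those numbers: 100 fsyncs in 1.014 s is about 99 per second, which fits the 90 to 120 revolutions per second of a 5400 to 7200 rpm drive (the rpm figures are assumed for illustration, the actual drive speed isn't stated here). 100 fsyncs in 0.056 s is about 1800 per second, far more than a platter can deliver without the write cache. The sketch below does the same arithmetic and reports the achieved rate directly instead of relying on time(1). The function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() and pwrite() in strict modes */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Upper bound on synchronous rewrites of one block per second if each
 * rewrite has to wait for the platter to come around again. */
double max_syncs_per_sec(double rpm)
{
    return rpm / 60.0;
}

/* Rewrite byte 0 of `path` `iterations` times, calling fsync() after each
 * write. If `touch_inode` is nonzero, also do the two fchmod() calls from
 * the test program above. Returns achieved syncs per second, or -1.0 on error. */
double measure_syncs_per_sec(const char *path, int iterations, int touch_inode)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
    if (fd < 0)
        return -1.0;

    struct timespec start, end;
    char byte = 'x';
    int failed = 0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (int i = 0; i < iterations && !failed; i++) {
        if (pwrite(fd, &byte, 1, 0) != 1)
            failed = 1;
        if (!failed && touch_inode &&
            (fchmod(fd, 0644) != 0 || fchmod(fd, 0664) != 0))
            failed = 1;
        if (!failed && fsync(fd) != 0)
            failed = 1;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    if (failed)
        return -1.0;

    double secs = (double)(end.tv_sec - start.tv_sec) +
                  (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return secs > 0.0 ? iterations / secs : -1.0;
}
```

If measure_syncs_per_sec() on a single rotating disk comes back well above max_syncs_per_sec() for that disk's speed, something between fsync() and the platter is caching the write.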

With the fchmod()s left in, the same program runs about 3x slower, but
that's expected: it's doing twice the writes, and in different places.

.TM.
