On Tue, 17 Mar 2009, Ron Mayer wrote:
> I wonder if there should be an optional fsync mode
> in postgres that should turn fsync() into
>     fchmod (fd, 0644); fchmod (fd, 0664);
> to work around this issue.

The test I haven't had time to run yet is to turn the bug-exposing program
you were fiddling with into a more accurate representation of WAL
activity, to see if that chmod still changes the behavior there. I think
the most dangerous possibility here is if you create a new WAL segment and
immediately fill it, all in less than a second. Basically, what
XLogFileInit does:
-Open with O_RDWR | O_CREAT | O_EXCL
-Write XLogSegSize (16MB) worth of zeros
-fsync
Followed by simulating what XLogWrite would do if you fed it enough data
to force a segment change:
-Write a new 16MB worth of data
-fsync
If you did all that in under a second, would you still get a filesystem
flush each time? From the description of the problem I'm not so sure
anymore. I think that's how tight the window would have to be for this
issue to show up right now: you'd only be exposed if you filled a new WAL
segment faster than the associated journal commit happened (basically, a
crash when WAL write volume is >16MB/s in a situation where new segments
are being created). But from what I've read about ext4, I think that
window for mayhem might widen on that filesystem--that's what got me
reading up on this whole subject recently, before this thread even started.
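
For anyone who wants to try this, the sequence above could be sketched
roughly as follows. This is only a stand-in for the I/O pattern, not
PostgreSQL code: simulate_segment, write_fill, the 8kB chunk size, and the
0xAB fill byte are all made-up names and values for illustration, and the
timing just tells you whether the whole thing fit inside a second on your
hardware.

```c
/* Sketch only: mimics the create/zero/fsync/fill/fsync pattern. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define SEG_SIZE   (16 * 1024 * 1024)  /* XLogSegSize as described above */
#define BLOCK_SIZE 8192                /* assumed write granularity */

/* Write `total` bytes of `fill` to fd in BLOCK_SIZE chunks; 0 on success. */
static int write_fill(int fd, unsigned char fill, size_t total)
{
    unsigned char buf[BLOCK_SIZE];
    size_t done = 0;

    memset(buf, fill, sizeof buf);
    while (done < total) {
        size_t want = total - done < sizeof buf ? total - done : sizeof buf;
        ssize_t n = write(fd, buf, want);

        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, retry the same chunk */
        if (n <= 0)
            return -1;
        done += (size_t) n;
    }
    return 0;
}

/*
 * Create a new segment file, zero-fill it and fsync, then overwrite it
 * with data and fsync again.  Returns elapsed seconds, or -1.0 on error
 * (including the file already existing, because of O_EXCL).
 */
double simulate_segment(const char *path, size_t seg_size)
{
    struct timespec t0, t1;
    int fd, ok;

    assert(seg_size > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);

    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1.0;

    ok = write_fill(fd, 0, seg_size) == 0       /* zero the new segment */
        && fsync(fd) == 0
        && lseek(fd, 0, SEEK_SET) == 0          /* back to the start */
        && write_fill(fd, 0xAB, seg_size) == 0  /* fill it with "WAL" */
        && fsync(fd) == 0;

    if (close(fd) != 0)
        ok = 0;

    clock_gettime(CLOCK_MONOTONIC, &t1);
    if (!ok)
        return -1.0;
    return (double) (t1.tv_sec - t0.tv_sec)
        + (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling simulate_segment(path, SEG_SIZE) on a path that does not exist yet,
on the filesystem under test, gives the elapsed time for one cycle. It
doesn't simulate the crash itself, so it can only show whether the whole
create-and-fill cycle lands inside the one-second window.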
The other ameliorating factor here is that in order for this to bite you,
I think you'd need to have another, incorrectly ordered write somewhere
else that could happen before the delayed write. Not sure where that
might be possible in the PostgreSQL WAL implementation yet.
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD