I'm frequently getting these errors in my console:
4/11/09 2:25:04 PM org.postgresql.postgres[192] ERROR: could not read directory "pg_xlog": Invalid argument
4/11/09 2:25:56 PM org.postgresql.postgres[192] ERROR: could not read directory "pg_xlog": Invalid argument
4/11/09 2:36:03 PM org.postgresql.postgres[192] ERROR: could not read directory "pg_xlog": Invalid argument
and rarely:
3/11/09 10:32:31 PM org.postgresql.postgres[217] ERROR: could not read directory "pg_clog": Invalid argument
It is clearly not failing all the time, as the pg_xlog directory is full of files that keep being touched and updated. I have not experienced data loss (yet), but large queries are taking orders of magnitude longer than I would like.
System:
Mac Pro Quad Nehalem 2.93GHz, 16GB RAM running Snow Leopard OS X 10.6.1 in 64-bit mode
Postgres 8.4.1 (Intel 64 bit) from http://www.kyngchaos.com/software:postgres
(I have also tried compiling from source: I get the same problems plus a few extra installation issues. The "official" postgresql binary from http://www.enterprisedb.com/ is not 64-bit.)
The postgres data directory is on an SSD RAID 0 array. In other applications the array sustains around 10K random read I/Os per second, or 5K random write I/Os per second. pg_xlog and pg_clog are on the same SSD RAID array as the postgres DB.
Under postgres it does several thousand I/Os per second for about 1-2 seconds, then drops back to only about 50 I/Os per second for about 10 seconds, before repeating the cycle. CPU usage is usually only a few percent. The console often records the error message "pg_xlog": Invalid argument during those brief bursts of activity.
I've looked at the source code in src/port/dirmod.c:
pgfnames(const char *path)
{
    ....
    while ((file = readdir(dir)) != NULL)
    {
        ....
        errno = 0;
    }
    ....
    if (errno)
    {
        ....
        fprintf(stderr, _("could not read directory \"%s\": %s\n"),
                path, strerror(errno));
        ....
    }
So it seems that readdir is returning "Invalid argument" occasionally. But I do not understand how this error could possibly occur in this location.
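To check whether this can be reproduced outside postgres, here is a minimal standalone sketch of the same errno pattern. It is an assumption-laden test harness, not postgres code: scan_dir is a hypothetical helper name, and it only counts entries rather than collecting the names as pgfnames does.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Count the entries in a directory, using the same convention as the
 * loop above: clear errno before each readdir() call, so that a NULL
 * return with errno still 0 means end-of-directory and a NULL return
 * with errno set means a real error.
 *
 * scan_dir is a hypothetical helper for testing, not part of postgres.
 * Returns the entry count, or -1 on failure with the errno value
 * stored in *err.
 */
static long
scan_dir(const char *path, int *err)
{
    DIR           *dir;
    struct dirent *file;
    long           count = 0;

    *err = 0;

    dir = opendir(path);
    if (dir == NULL)
    {
        *err = errno;
        return -1;
    }

    errno = 0;
    while ((file = readdir(dir)) != NULL)
    {
        count++;
        errno = 0;      /* reset before the next readdir() call */
    }

    if (errno)
    {
        /* save errno first: fprintf() and closedir() may change it */
        *err = errno;
        fprintf(stderr, "could not read directory \"%s\": %s\n",
                path, strerror(*err));
        closedir(dir);
        return -1;
    }

    closedir(dir);
    return count;
}
```

Calling scan_dir repeatedly on the pg_xlog path while the database is under load might show whether readdir fails with EINVAL independently of postgres, which would point at the filesystem or OS rather than at the server.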
I've searched for "pg_xlog": Invalid argument, and the only other mention I have found was on Linux running on a ram disk.
Could this be a race condition? Suggestions?
Stephen