Hi Lars, I do not know if this makes sense in PostgreSQL and that readers do not block writers and writes do not block readers. Are your UPDATEs to individual rows, each in a separate transaction, or do you UPDATE multiple rows in the same transaction? If you perform multiple updates in a single transaction, you are synchronizing the changes to that set of rows and that constraint is causing other readers that need to get the correct values post- transaction to wait until the COMMIT completes. This means that the WAL write must be completed. Have you tried disabling synchronous_commit? If this scenario holds, you should be able to reduce the slowdown by un-batching your UPDATEs, as counter-intuitive as that is. This seems to be similar to a problem that I have been looking at with using PostgreSQL as the backend to a Bayesian engine. I am following this thread with interest. Regards, Ken On Thu, Jul 07, 2011 at 04:56:13PM -0700, lars wrote: > I am doing some research that will hopefully lead to replacing a big > Oracle installation with a set PostgreSQL servers. > > The current Oracle installations consists of multiple of RAC > clusters with 8 RAC nodes each. Each RAC node has 256gb of > memory (to be doubled soon). > The nature of our service is such that over a reasonable time (a day > or so) the database *is* the working set. > > So I am looking at Postgres in a context where (almost) all of the > data is cached and disk IO is only required for persistence. > > Setup: > PostgreSQL 9.1beta2 on a high memory (~68gb, 12 cores) EC2 Linux > instance (kernel 2.6.35) with the database and > WAL residing on the same EBS volume with EXT4 (data=ordered, > barriers=1) - yes that is not an ideal setup > (WAL should be on separate drive, EBS is slow to begin, etc), but I > am mostly interested in read performance for a fully cached > database. > > shared_buffers: varied between 1gb and 20gb > checkpoint_segments/timeout: varied accordingly between 16-256 and > 5-10m, resp. > bgwriter tweaked to get a good distribution of checkpoints, > bg-writes, and backend writes. > wal_sync_method: tried fdatasync and open_datasync. > > I read "PostgreSQL 9.0 high performance", and have spent some > significant amount of time on this already. > > PostgreSQL holds up extremely well, once things like "storing > hint-bits", checkpoints vs bgwriter vs backend_writes, etc > are understood. I installed pg_buffercache and pgfincore to monitor > how and where the database is stored. > > There is one observation that I wasn't able to explain: > A SELECT only client is severely slowed down by a concurrent client > performing UPDATES on the same table the other > client selects from, even when the database resides 100% in the > cache (I tried with shared_buffers large enough to hold > the database, and also with a smaller setting relying on the OS > cache, the behavior is the same). > > As long as only the reader is running I get great performance > (20-30ms, query reading a random set of about 10000 rows > out of 100m row table in a single SELECT). The backend is close to > 100% cpu, which is what want in a cached database. > > Once the writer starts the read performance drops almost immediately > to >200ms. > The reading backend's cpu drop drop to <10%, and is mostly waiting > (D state in top). > The UPDATE touches a random set of also about 10000 rows (in one > update statement, one of the columns touched is > indexed - and that is the same index used for the SELECTs). > > What I would have expected is that the SELECTs would just continue > to read from the cached buffers (whether dirtied > or not) and not be affected by concurrent updates. I could not find > anything explaining this. > > The most interesting part: > that this does not happen with an exact clone of that relation but > UNLOGGED. The same amount of buffers get dirty, > the same amount checkpointing, bgwriting, vacuuming. The only > difference is WAL maintenance as far as I can tell. > > Is there some (intentional or not) synchronization between backend > when the WAL is maintained? Are there times when > read only query needs to compete disk IO when everything is cached? > Or are there any other explanations? > > I am happy to provide more information. Although I am mainly looking > for a qualitative answer, which could explain this behavior. > > Thanks. > > -- Lars > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance