Hi On Thursday 07 November 2013 06:40:00 Kevin Grittner wrote: > Robert Voinea <rvoinea@xxxxxxxxx> wrote: > > From time to time (1-2 months) one of the production systems I > > manage starts acting crazy... and this is starting to become a > > problem. > > > > Every query I send to the server has a very long running time > > (sometimes it reaches 45+ seconds). Even the simples queries like > > "SELECT NOW()" run in 4-5 seconds... or more. > > > > This goes on for a few minutes. After that, everything is back to > > normal. > > A likely cause for this is transparent huge page defrag. If you > run `vmstat 1` during an episode and see a lot of system CPU time, > that tends to confirm this. If you run `perf top` during an > episode and see the kernel spending a lot of time in spinlock > functions, that pretty much nails it. You might want to turn off > transparent huge page support. I need to read about this first. I forgot to mention that this is a 32 bit system... and due to the known limitations we are forced to use PAE on the kernel... Unfortunately we cannot use a 64 bit operating system. > > I'm guessing that, somehow, there is heavy I/O usage at that time > > Which might be another possibility if the `vmstat 1` shows a lot of > I/O wait time. Another colleague of mine managed to somehow reproduce the issue. It seems that the disk + network interface could be the bottle neck (iostat & iotop shows average wait times very high) and the peak throughput of the disk at 70MB/s. This is not the actual production system, but a similar one (as hardware). We'll try to re-do the tests with the clients and the server connected via a Gigabit network and see if this helps. > > I've seen that postgres 9.1.8 Changelog contains some bug fixes > > related to performance of autovaccum: > If there is high I/O and you see autovacuum running wraparound > prevention vacuums or vacuuming tables which have recently had a > large number of rows deleted, you might solve the problem by > updating. I would strongly recommend updating anyway, because > otherwise you are running with known bugs and security > vulnerabilities, including one very serious one. > > Why knowingly put yourself through dealing with bugs that others > have already encountered, reported, diagnosed, and fixed? I agree with you on this, but the update requires downtime and has to be scheduled in advance. Thank you for replying. I will investigate this even further and see what I may come up with. I'll keep you informed with what I find. -- Robert Voinea Software Engineer +4 0740 467 262 Don't take life too seriously. You'll never get out of it alive. (Elbert Hubbard) -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin