On 26 Listopad 2011, 10:45, Gaëtan Allart wrote: > A better view of iotop : > > TID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND > 31875 be/4 postgres 0.00 B/s 15.23 M/s 0.00 % 0.00 % postgres: > database database 46.105.104.205(50228) SELECT > 30985 be/4 postgres 0.00 B/s 10.55 M/s 0.00 % 0.00 % postgres: > database database 46.105.104.205(47672) SELECT > > As you can see, SELECTS are writing very much on the disk. > > At this moment, pg_stat_activity show many many queries running at the > same time. > It looks like that all queries are suddenly writing on disk and not a > particular one, making me think of a buffer issue or something. No, I don't think this is an issue with the size of shared buffers. That'd influence the pg_stat_bgwriter - the buffers_backend would grow much faster, and it's not the case. So I'm guessing it's either hint bits or sorting. If I had to, I'd probably guess about the hint bits - see for example this thread http://archives.postgresql.org/pgsql-performance/2008-05/msg00182.php it more or less behaves like what you described so far. > This is a example of 10 MB/s writing query : > > SELECT COUNT(*) FROM (SELECT DISTINCT "table"."id" AS "id", > "table"."flux_id" AS "flux_id", "table"."locale_id" AS "locale_id", > "table"."url_article" AS "url_article", "table"."original_url" AS > "original_url", "table"."name" AS "name", "table"."description" AS > "description", "table"."content" AS "content", "table"."permis" AS > "permis", "table"."reviewed" AS "reviewed", "table"."author_id" AS > "author_id", "table"."poster_id" AS "poster_id", "table"."post_date" AS > "post_date", "table"."edit_date" AS "edit_date", "table"."add_date" AS > "add_date", "table"."comments_open" AS "comments_open", "table"."site_id" > AS "site_id", "table"."is_local" AS "is_local", "table"."status" AS > "status", "table"."visits" AS "visits", "table"."votes" AS "votes", > "table"."score" AS "score", "arti Post EXPLAIN ANALYZE output of this query (use explain.depesz.com to post it). > Checkpoints logs still show very long write times : > > LOG: checkpoint complete: wrote 92 buffers (0.0%); 0 transaction log > file(s) added, 0 removed, 1 recycled; write=49.622 s, sync=6.510 s, > total=63.625 s No, that's fine - that's what spread checkpoints do. Once the checkpoint_timeout expires, the system decides to perform a checkpoint, i.e. it has to write all dirty (modified) shared buffers to the disk. But it knows the next checkpoint will happen in checkpoint_timeout, so it has about 5 minutes to write all the data. So it says something like 'I need to write 540MB want to write that in 270 seconds (5 minutes * completion_target), so I'll write at 2MB/s'. So the write phase is expected to take long. But it has to sync the data at the end, and that's where the problems usually happen - so the important thing is 'sync' and that improved significantly. 6 seconds is not great but it's not a big issue I guess. > Couldn't this be a hardware issue ? I don't think so. The problem is that the queries write a lot of data, and the SSD can't help in this case. The SSDs provide much better random performance (compared to spinners), but the sequential performance is not much better. So pushing 70MB/s to the SSD may be fully utilized. Have you tested the basic performance (like bonnie++, dd etc) before going into production? There are some common SSD-related mistakes (e.g. not aligning the partitions properly) that may easily cause 30% performance loss. But this won't fix the problem, although you could get a bit better performance. Tomas -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general