On 24 January 2012, 21:16, Dave Crooke wrote:
> Hi folks
>
> This could be a sheer volume issue, but I thought I would ask the wisdom
> of this forum as to next investigative steps.
>
> ----
>
> We use PostgreSQL 8.4.4, which is bundled with our application as a
> VMware virtual appliance. The bulk of the app's database activity is
> recording performance data points, which arrive in fairly large sustained
> bursts of perhaps 10,000 rows a minute at a medium-sized customer. Each
> data point is a logically separate item and is committed as an individual
> transaction (JDBC auto-commit mode). Our offshore QA team was assigned to
> track an intermittent issue with the speed of some large queries on other
> tables, and based on correlation they believe the two activities may be
> contending.
>
> The large query is coming off of different tables from the ones being
> written to ... the raw data goes into a table named by day (partitioning
> is all within the app, not PG), e.g. PERF_RAW_2012_01_24, and then there
> are a bunch of rollup statements which run hourly to do the aggregations,
> e.g.

Each storage device has some basic I/O limits - sequential speed
(read/write) and the maximum number of I/O operations it can handle per
second. For example, a 7.2k drive can do up to 160 MB/s of sequential
reads/writes, but not more than about 120 I/O operations per second.
Similarly for other devices - 15k drives can do up to about 250 I/Os per
second, and SSDs can handle far more, e.g. an Intel 320 can do about 8k
I/Os per second.

I have no idea what kind of storage device you're using or how many
sequential and random operations it can handle. But my guess is that
you're hitting the limit on random I/Os - each commit requires an fsync,
and you're doing 10,000 of them per minute, i.e. about 160 per second. If
the queries also need to read data from the drive (e.g. randomly), that
just adds more I/Os.

> Is there any tweaking we should do on the PG settings, or on the pattern
> in which the app is writing - we currently use 10 writer threads on the
> Java side and they keep PG going pretty good.

The first thing you should do is group the inserts into one transaction.
That will lower the number of I/Os the database needs to do. Besides that,
you can move the WAL to a separate (physical) device, thus spreading the
I/Os across more drives.
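On the JDBC side the batching could look roughly like the sketch below.
This is a minimal illustration only - the connection details, column names
and batch size are made up; just the table name follows the PERF_RAW_*
naming from your schema. The point is setAutoCommit(false) plus
addBatch()/executeBatch(), so you pay one fsync per batch instead of one
per row:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.Timestamp;

    public class BatchedPerfWriter {
        // Rows per commit - illustrative value, tune to your workload.
        private static final int BATCH_SIZE = 500;

        public static void main(String[] args) throws Exception {
            // Hypothetical connection details.
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/perf", "app", "secret");
            conn.setAutoCommit(false);   // no longer one commit (fsync) per row

            // Hypothetical column names; table name follows the daily pattern.
            PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO perf_raw_2012_01_24 (item_id, ts, value) " +
                    "VALUES (?, ?, ?)");

            int pending = 0;
            for (int i = 0; i < 10000; i++) {   // stand-in for one minute's burst
                ps.setLong(1, i % 100);
                ps.setTimestamp(2, new Timestamp(System.currentTimeMillis()));
                ps.setDouble(3, Math.random());
                ps.addBatch();
                if (++pending == BATCH_SIZE) {
                    ps.executeBatch();
                    conn.commit();              // one fsync per BATCH_SIZE rows
                    pending = 0;
                }
            }
            ps.executeBatch();                  // flush the final partial batch
            conn.commit();
            ps.close();
            conn.close();
        }
    }

With your 10 writer threads, each thread would get its own connection and
batch. Even modest batches (say 100-500 rows per commit) cut the commit
and fsync rate by two orders of magnitude or so.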
> I considered bundling the writes into larger transactions, will that
> really help much with commit consistency off?

What do you mean by "commit consistency off"?

> Is there some specific "usual suspect" stuff I should look at on the PG
> side to look for efficiency issues such as index lock contention or a
> poor buffer cache hit ratio? Will doing EXPLAIN ANALYSE on the big query
> be informative, and if so, does it need to be done while the write load
> is applied?

The first thing you should do is gather some basic I/O stats. Run
pg_test_fsync (a contrib module) to see how many fsync operations the I/O
subsystem can handle (if it reports more than 500, use "-o" to get it
running for a longer time). Then gather "vmstat 1" and "iostat -x 1" for a
few seconds while the workload (inserts and queries) is actually running.
That should tell you how the drives are actually being utilized. Post
these results to this list.

> Relevant schema and config attached, all comments and advice welcome,
> including general tuning tips and rationale for moving to PG 9.x .... I'm
> well aware this isn't the acme of PG tuning :)

There's a nice page about tuning on the wiki:

  http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

I'd recommend significantly increasing the number of checkpoint segments,
e.g. to 64 (1GB), and setting checkpoint_completion_target to 0.9. This
usually helps write-heavy workloads. And enable log_checkpoints.

Tomas
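PS: For reference, those three settings in postgresql.conf would look
roughly like this (values are just the ones suggested above; with the
default 16MB segment size, 64 segments is about 1GB of WAL):

    checkpoint_segments = 64            # default is 3, too low for heavy writes
    checkpoint_completion_target = 0.9  # spread checkpoint writes over time
    log_checkpoints = on                # log checkpoint timing and frequency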