Can lots of small writes badly hamper reads from other tables?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi folks

This could be a sheer volume issue, but I though I would ask the wisdom of this forum as to next investigative steps.

----

We use PostgreSQL 8.4.4 which is bundled with our application as a VMware virtual appliance. The bulk of the app's database activity is recording performance data points which arrive in farily large sustained bursts of perhaps 10,000 rows a minute at a medium sized customer, each of which are logically separate items and being committed as individual transactions (JDBC auto-commit mode). Our offshore QA team was assigned to track an intermittent issue with speed of some large queries on other tables, and they believe based on correlation the two activities may be contending.

The large query is coming off of different tables from the ones being written to ... the raw data goes into a table named by day (partitioning is all within the app, not PG) e.g. PERF_RAW_2012_01_24 and then there are a bunch of rollup statements which run hourly to do the aggregations, e.g.

insert into PERF_HOURLY_2012_01_24 select key_columns, avg(data), now() from perf_raw_2012_01_24 where time_stamp between (now() - interval '1 hour') and now() group by key_columns

The big queries are hitting multiple of the PERF_HOURLY tables and pulling a few dozen rows from each.

We are using a 64-bit VM with 8 virtual cores and 8GB RAM, of which Java takes a bit over half, and Linux XXXXX with CentOS 5.x  .... PG has 1GB of buffer cache and reasonable (AFAICT) resource limits for everything else, which are intended to be workable for a range of client sizes out of the box. True transactional consistency is disabled for performance reasons, virtual environments do not take kindly to lots of small writes.

---

Is there any tweaking we should do on the PG settings, or on the pattern in which the app is writing - we currently use 10 writer threads on the Java side and they keep PG going pretty good.

I considered bundling the writes into larger transactions, will that really help much with commit consistency off?

Is there some specific "usual suspect" stuff I should look at on the PG side to look for efficiency issues such as index lock contention or a poor buffer cache hit ratio? Will doing EXPLAIN ANALYSE on the big query be informative, and if so, does it need to be done while the write load is applied?

The other whacky idea I had was to have the writer threads pause or throttle themselves when a big query is happening (it's all in one JVM and we are using a connection pooler, so it's easy to intercept and track if needed) however that strikes me as a rather ugly hack and I'd prefer to do something more robust and based on config tweaks that leverage existing resource management in PG.

Relevant schema and config attached, all comments and advice welcome, including general tuning tips and rationale for moving to PG 9.x .... I'm well aware this isn't the acme of PG tuning :)

Cheers
Dave



Attachment: postgresql.conf
Description: Binary data

-- The current day's perf_raw table contains the extra "b" index
-- to facilitate certain queries including the rollup ones which
-- create the perf_hourly tables. This index is dropped once the
-- table is no longer being written to.

            Table "public.perf_raw_2011_12_31"
     Column      |            Type             | Modifiers 
-----------------+-----------------------------+-----------
 guid_key        | integer                     | 
 property_key    | integer                     | 
 instance_id_key | integer                     | 
 time_stamp      | timestamp without time zone | 
 value           | double precision            | 
Indexes:
    "perf_raw_2011_12_31_a" btree (guid_key, property_key, time_stamp)
    "perf_raw_2011_12_31_b" btree (time_stamp, property_key)

            Table "public.perf_raw_2011_12_29"
     Column      |            Type             | Modifiers 
-----------------+-----------------------------+-----------
 guid_key        | integer                     | 
 property_key    | integer                     | 
 instance_id_key | integer                     | 
 time_stamp      | timestamp without time zone | 
 value           | double precision            | 
Indexes:
    "perf_raw_2011_12_29_a" btree (guid_key, property_key, time_stamp)

           Table "public.perf_hourly_2011_12_29"
     Column      |            Type             | Modifiers 
-----------------+-----------------------------+-----------
 guid_key        | integer                     | 
 property_key    | integer                     | 
 instance_id_key | integer                     | 
 time_stamp      | timestamp without time zone | 
 value           | double precision            | 
Indexes:
    "perf_hourly_2011_12_29_a" btree (guid_key, property_key, time_stamp)

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux