On 11/22/2016 11:41 PM, azhwkd@xxxxxxxxx wrote:
Greetings! I'm using postgres version 9.5.5 on a ubuntu 16.04.1 server installation which was installed through apt-get. I have a query which if run alone usually completes in about 300ms. When run in my application this query constantly locks up and bogs down all connections of the connection pool (In the application this query is run up to 10 times in parallel with different parameters). What's really weird is that I can re-run one of the hung queries from the command line while it's hung and it will complete as expected while the hung queries continue to use 100% CPU time.
Have you looked in the Postgres logs to see if there are any relevant messages?
Best guess is that running in parallel is creating a lock problem where one of the queries is holding a lock that is keeping the others from completing.
What relationships do the group_history and group_history_* tables have to other tables?
What is the definition of the trigger that fires the trigger function below?
The query in question is this: insert into group_history ("group", id, sub_category, "date", aa, ab, bb, ba, quantity, "hour") (select a."group", a.id, b.sub_category, to_timestamp($2)::date as "date", max(a.aa / a.quantity) as aa, min(a.aa / a.quantity) as ab, max(a.bb / a.quantity) as bb, min(a.bb/ a.quantity) as ba, sum(a.quantity) as quantity, extract('hour' from to_timestamp($2)) as "hour" from tbla a join tblb b on a.id = b.id where a."group" = $1 and b."group" = $1 group by a."group", a.id, b.sub_category ); When I'm running a perf on the system it looks like this while running the query 10 times: Samples: 4M of event 'cpu-clock', Event count (approx.): 18972107951 Overhead Shared Object Symbol 17.95% postgres [.] heap_hot_search_buffer 5.64% postgres [.] heap_page_prune_opt 4.62% postgres [.] hash_search_with_hash_value 3.80% postgres [.] LWLockRelease 3.73% postgres [.] 0x00000000002f420d 2.50% postgres [.] _bt_checkkeys 2.48% postgres [.] hash_any 2.45% postgres [.] 0x00000000002f41e7 2.10% postgres [.] slot_getattr 1.80% postgres [.] ResourceOwnerForgetBuffer 1.58% postgres [.] LWLockAcquire 1.58% postgres [.] ReadBufferExtended 1.54% postgres [.] index_fetch_heap 1.47% postgres [.] MemoryContextReset 1.43% postgres [.] btgettuple 1.38% postgres [.] 0x00000000002d710c 1.36% postgres [.] 0x00000000002d70a5 1.35% postgres [.] ExecQual Explain (Analyze, Verbose) Output QUERY PLAN ---------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.group_history (cost=10254.36..10315.16 rows=2432 width=62) (actual time=1833.967..1833.967 rows=0 loops=1) -> Subquery Scan on "*SELECT*" (cost=10254.36..10315.16 rows=2432 width=62) (actual time=353.880..376.490 rows=6139 loops=1) Output: "*SELECT*"."group", "*SELECT*".id, "*SELECT*".sub_category, "*SELECT*"."when", "*SELECT*".aa, "*SELECT*".ab, "*SELECT*".bb, "*SELECT*".ba, "*SELECT*".quantity, "*SELECT*"."hour" -> HashAggregate (cost=10254.36..10278.68 rows=2432 width=28) (actual time=353.871..367.144 rows=6139 loops=1) Output: a."group", a.id, b.sub_category, '2016-11-20'::date, max((a.aa / a.quantity)), min((a.aa / a.quantity)), max((a.bb / a.quantity)), min((a.bb / a.quantity)), sum(a.quantity), '21'::double precision Group Key: a."group", a.id, b.sub_category -> Hash Join (cost=5558.64..10181.40 rows=2432 width=28) (actual time=193.949..294.106 rows=30343 loops=1) Output: a."group", a.id, a.aa, a.quantity, a.bb, b.sub_category Hash Cond: (b.id = a.id) -> Bitmap Heap Scan on public.auctions_extra b (cost=685.19..4719.06 rows=30550 width=8) (actual time=56.678..111.038 rows=30343 loops=1) Output: b.sub_category, b.id Recheck Cond: (b."group" = 7) Heap Blocks: exact=289 -> Bitmap Index Scan on auction_extra_pk (cost=0.00..677.55 rows=30550 width=0) (actual time=55.966..55.966 rows=30343 loops=1) Index Cond: (b."group" = 7) -> Hash (cost=4280.62..4280.62 rows=30627 width=28) (actual time=137.160..137.160 rows=30343 loops=1) Output: a."group", a.id, a.aa, a.quantity, a.bb, a.id Buckets: 16384 Batches: 4 Memory Usage: 638kB -> Bitmap Heap Scan on public.tbla a (cost=689.78..4280.62 rows=30627 width=28) (actual time=58.530..117.064 rows=30343 loops=1) Output: a."group", a.id, a.aa, a.quantity, a.bb, a.id Recheck Cond: (a."group" = 7) Heap Blocks: exact=254 -> Bitmap Index Scan on tbla_pk (cost=0.00..682.12 rows=30627 width=0) (actual time=57.801..57.801 rows=30343 loops=1) Index Cond: (a."group" = 7) Planning time: 0.475 ms Trigger group_history_trigger: time=1442.561 calls=6139 Execution time: 1834.119 ms group_history_trigger: CREATE OR REPLACE FUNCTION public.group_history_partition_function() RETURNS trigger LANGUAGE plpgsql AS $function$ declare _new_date timestamptz; _tablename text; _startdate text; begin -- Takes the current inbound "when" value and determines when midnight is for the given date _new_date := date_trunc('day', new."when"); _startdate := to_char(_new_date, 'YYYY_MM_DD'); _tablename := 'group_history_'||_startdate; -- Insert the current record into the correct partition execute 'INSERT INTO public.' || quote_ident(_tablename) || ' VALUES ($1.*) on conflict ("group", id, sub_category, "when", "hour") do update set aa = excluded.aa, ab = excluded.ab, bb = excluded.bb, ba = excluded.ba, quantity = excluded.quantity' using new; return null; end $function$ Has anyone experienced this before or even has a solution for this? I'd be very grateful. Kind regards Sebastian
-- Adrian Klaver adrian.klaver@xxxxxxxxxxx -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general