In progress INSERT wrecks plans on table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Recently we encountered the following unhappy sequence of events:

1/ system running happily
2/ batch load into table begins
3/ very quickly (some) preexisting queries on said table go orders of magnitude slower
4/ database instance becomes unresponsive
5/ application outage

After looking down a few false leads, We've isolated the cause to the following:

The accumulating in-progress row changes are such that previously optimal plans are optimal no longer. Now this situation will fix itself when the next autoanalyze happens (and new plan will be chosen) - however that cannot occur until the batch load is completed and committed (approx 70 seconds). However during that time there is enough of a performance degradation for queries still using the old plan to cripple the server.

Now that we know what is happening we can work around it. But I'm wondering - is there any way (or if not should there be one) to let postgres handle this automatically? I experimented with a quick hack to src/backend/commands/analyze.c (attached) that lets another session's ANALYZE see in progress rows - which works but a) may cause other problems and b) does not help autoaanalyze which has to wait for COMMIT + stats message.

I've attached a (synthetic) test case that shows the issue, I'll reproduce the output below to hopefully make the point obvious:


               Table "public.plan"
 Column |            Type             | Modifiers
--------+-----------------------------+-----------
 id     | integer                     | not null
 typ    | integer                     | not null
 dat    | timestamp without time zone |
 val    | text                        | not null
Indexes:
    "plan_id" UNIQUE, btree (id)
    "plan_dat" btree (dat)
    "plan_typ" btree (typ)


[Session 1]
EXPLAIN ANALYZE
SELECT * FROM plan
WHERE typ = 3 AND dat IS NOT NULL;
                                                     QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Index Scan using plan_dat on plan (cost=0.00..265.47 rows=55 width=117) (actual time=0.130..4.409 rows=75 loops=1)
   Index Cond: (dat IS NOT NULL)
   Filter: (typ = 3)
   Rows Removed by Filter: 5960
 Total runtime: 4.440 ms
(5 rows)

[Session 2]

BEGIN;
INSERT INTO plan
SELECT id + 2000001,typ,current_date + id * '1 seconds'::interval ,val
FROM plan
;

[Session 1]
EXPLAIN ANALYZE
SELECT * FROM plan
WHERE typ = 3 AND dat IS NOT NULL;

                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Index Scan using plan_dat on plan (cost=0.00..551.35 rows=91 width=117) (actual time=0.131..202.699 rows=75 loops=1)
   Index Cond: (dat IS NOT NULL)
   Filter: (typ = 3)
   Rows Removed by Filter: 5960
 Total runtime: 202.729 ms
(5 rows)
[Session 2]
COMMIT;

[Session 1...wait for autoanalyze to finish then]

EXPLAIN ANALYZE
SELECT * FROM plan
WHERE typ = 3 AND dat IS NOT NULL;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on plan (cost=407.87..44991.95 rows=10116 width=117) (actual time=2.692..6.582 rows=75 loops=1)
   Recheck Cond: (typ = 3)
   Filter: (dat IS NOT NULL)
   Rows Removed by Filter: 19925
-> Bitmap Index Scan on plan_typ (cost=0.00..405.34 rows=20346 width=0) (actual time=2.573..2.573 rows=20000 loops=1)
         Index Cond: (typ = 3)
 Total runtime: 6.615 ms


Regards

Mark

Attachment: plan.tar.gz
Description: application/gzip

*** analyze.c.orig	2013-04-26 10:40:06.634942283 +1200
--- analyze.c	2013-04-26 11:36:13.537404101 +1200
***************
*** 1173,1183 ****
  					 * has to adjust the numbers we send to the stats
  					 * collector to make this come out right.)
  					 */
! 					if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(targtuple.t_data)))
! 					{
! 						sample_it = true;
! 						liverows += 1;
! 					}
  					break;
  
  				case HEAPTUPLE_DELETE_IN_PROGRESS:
--- 1173,1181 ----
  					 * has to adjust the numbers we send to the stats
  					 * collector to make this come out right.)
  					 */
! 					/* Amend so insert in progress tuples are counted */
! 					sample_it = true;
! 					liverows += 1;
  					break;
  
  				case HEAPTUPLE_DELETE_IN_PROGRESS:
-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux