Jens Schipkowski wrote:
And running vacuum every 30 seconds does what?
Not yet fully tested. It seems to reduce the slowdown.
But reducing the slowdown doesn't solve the problem. The problem is the
decreasing execution speed of DELETE and SELECT statements once the
table reaches 150k - 200k rows. The table only starts growing after
DELETE statements fail with a timeout.
It may not sound professional, but the database seems to age as the
number of executed statements increases.
It sounds very likely if you aren't vacuuming enough or the tables are
growing rapidly.
Vacuuming once a minute is not enough? We reach 3k statements per
minute (at the start of testing); 1/4 of them are INSERTs and DELETEs.
After 5 minutes a DELETE takes about 50 seconds, compared to about
300 - 600ms at startup.
You want to vacuum enough so that the deletes don't leave
ever-increasing gaps in your table. If you run VACUUM VERBOSE it will
tell you what it did - if the number of (non-)removable rows keeps
increasing, you're not vacuuming enough. The trick is to vacuum often,
so that no single run has much work to do. The autovacuum tool in recent
versions tries to estimate this for you, but might not cope in your case.
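As a concrete illustration of the check described above (the settings
shown are examples, not a recommendation for this workload):

```sql
-- Run manually between load cycles and watch the removable /
-- nonremovable row counts in the output; if the nonremovable count
-- keeps climbing, dead rows are outliving the vacuums (e.g. a
-- long-running transaction is keeping them visible).
VACUUM VERBOSE tbl_reg;

-- Or let autovacuum decide; in postgresql.conf (8.1+), for example:
-- autovacuum = on
-- autovacuum_naptime = 60              # check once a minute
-- autovacuum_vacuum_scale_factor = 0.2
```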
The Statements
---------------
// normal insert - does not seem to be the problem - runtime is ok
INSERT INTO
tbl_reg(idreg,idtype,attr1,...,attr6,loc1,...,loc3,register)
VALUES(nextval('tbl_reg_idreg_seq'),1,[attr],[loc],1);
// the select finds out which records have no twin
// a twin is defined as a record with the same attr* values
// slows down over time until PostgreSQL's statement timeout is hit
SELECT *
FROM tbl_reg reg
WHERE register <> loc1
  AND idreg NOT IN
  (
    SELECT reg.idreg
    FROM tbl_reg reg, tbl_reg regtwin
    WHERE regtwin.register = 1
      AND regtwin.type <> 20
      AND reg.attr1 = regtwin.attr1
      AND reg.attr2 = regtwin.attr2
      AND reg.attr3 = regtwin.attr3
      AND reg.attr4 = regtwin.attr4
      AND reg.attr5 = regtwin.attr5
      AND reg.attr6 = regtwin.attr6
      AND reg.idreg <> regtwin.idreg
      AND reg.register = 2
  );
I tried to optimize the select statement, but the GROUP BY ... HAVING
count(*) > 1 solution is half as fast as this statement, judging by
the query plan of EXPLAIN ANALYZE.
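For comparison, a NOT EXISTS formulation sometimes performs better
than NOT IN on older PostgreSQL versions. A sketch using the column
names from the statement above - it assumes idreg is unique (it comes
from a sequence) and never NULL, and whether it preserves the exact
semantics would need checking against real data:

```sql
-- A row passes unless it is a register = 2 row that has a twin.
SELECT *
FROM tbl_reg reg
WHERE reg.register <> reg.loc1
  AND NOT (reg.register = 2 AND EXISTS
  (
    SELECT 1
    FROM tbl_reg regtwin
    WHERE regtwin.register = 1
      AND regtwin.type <> 20
      AND regtwin.attr1 = reg.attr1
      AND regtwin.attr2 = reg.attr2
      AND regtwin.attr3 = reg.attr3
      AND regtwin.attr4 = reg.attr4
      AND regtwin.attr5 = reg.attr5
      AND regtwin.attr6 = reg.attr6
      AND regtwin.idreg <> reg.idreg
  ));
```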
And what did EXPLAIN ANALYSE show here? I'm guessing you're getting
time increasing as the square of the number of rows in tbl_reg. So, if
50 rows takes 2500s then 100 rows will take 10000s. Now, if you had
enough RAM (or an index) I'd expect the planner to process the table
in a sorted manner so the query-time would increase linearly.
EXPLAIN ANALYZE shows at startup
QUERY PLAN
-------------------------------------------------------------------------------------------------
Seq Scan on tbl_reg (cost=25841.35..31433.17 rows=72424 width=6)
(actual time=673.877..673.877 rows=0 loops=1)
Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
2::smallint)))
SubPlan
-> Merge Join (cost=22186.21..25646.57 rows=77913 width=8)
(actual time=285.624..285.624 rows=0 loops=1)
Merge Cond: (("outer".attr1 = "inner".attr1) AND
("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
"inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?")
AND ("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
"inner".attr6))
Join Filter: ("outer".idreg <> "inner".idreg)
-> Sort (cost=4967.06..4971.65 rows=1835 width=56) (actual
time=285.622..285.622 rows=0 loops=1)
Sort Key: reg.attr1, (reg.attr2)::text,
(reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
-> Seq Scan on tbl_reg reg (cost=0.00..4867.59
rows=1835 width=56) (actual time=285.551..285.551 rows=0 loops=1)
Filter: (register = 2)
-> Sort (cost=17219.15..17569.77 rows=140247 width=56)
(never executed)
Sort Key: regtwin.attr1, (regtwin.attr2)::text,
(regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
regtwin.attr6
-> Seq Scan on tbl_reg regtwin (cost=0.00..5229.70
rows=140247 width=56) (never executed)
Filter: ((register = 1) AND ("type" <> 20))
Total runtime: 604.410 ms
(15 rows)
EXPLAIN ANALYZE shows, after 10 minutes of load and one vacuum run:
QUERY PLAN
----------------------------------------------------------------------------------------------------
Seq Scan on tbl_reg (cost=25841.35..31433.17 rows=72424 width=6)
(actual time=43261.910..43261.910 rows=0 loops=1)
Filter: ((hashed subplan) OR ((loc1 = 2::smallint) AND (loc2 =
2::smallint)))
SubPlan
-> Merge Join (cost=22186.21..25646.57 rows=77913 width=8)
(actual time=43132.063..43132.063 rows=0 loops=1)
Merge Cond: (("outer".attr1 = "inner".attr1) AND
("outer"."?column8?" = "inner"."?column8?") AND ("outer"."?column9?" =
"inner"."?column9?") AND ("outer"."?column10?" = "inner"."?column10?")
AND ("outer"."?column11?" = "inner"."?column11?") AND ("outer".attr6 =
"inner".attr6))
Join Filter: ("outer".idreg <> "inner".idreg)
-> Sort (cost=4967.06..4971.65 rows=1835 width=56) (actual
time=387.071..387.872 rows=1552 loops=1)
Sort Key: reg.attr1, (reg.attr2)::text,
(reg.attr3)::text, (reg.attr4)::text, (reg.attr5)::text, reg.attr6
-> Seq Scan on tbl_reg reg (cost=0.00..4867.59
rows=1835 width=56) (actual time=303.966..325.614 rows=1552 loops=1)
Filter: (register = 2)
-> Sort (cost=17219.15..17569.77 rows=140247 width=56)
(actual time=42368.934..42530.986 rows=145324 loops=1)
Sort Key: regtwin.attr1, (regtwin.attr2)::text,
(regtwin.attr3)::text, (regtwin.attr4)::text, (regtwin.attr5)::text,
regtwin.attr6
-> Seq Scan on tbl_reg regtwin (cost=0.00..5229.70
rows=140247 width=56) (actual time=0.015..1159.515 rows=145453 loops=1)
Filter: ((register = 1) AND ("type" <> 20))
Total runtime: 44073.127 ms
(15 rows)
OK - these plans look about the same, but the time is greatly different.
Both have rows=140247 as the estimated number of rows in tbl_reg. Either
you have many more rows in the second case (in which case you're not
running ANALYSE enough) or you have lots of gaps in the table (you're
not running VACUUM enough).
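One quick way to tell the two cases apart - stale statistics versus
table bloat - is to compare the planner's view of the table with
reality (a sketch; pg_class is a system catalog, so nothing needs to
be installed):

```sql
-- reltuples/relpages are the estimates from the last ANALYZE; compare
-- them with a live count to see how stale they are, and watch
-- relpages grow across runs to spot bloat from un-vacuumed deletes.
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tbl_reg';
SELECT count(*) FROM tbl_reg;
```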
I'd then try putting an index on (attr1,attr2,attr3...attr6) and see if
that helps reduce time.
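That suggestion might look like this (the index name is illustrative):

```sql
CREATE INDEX tbl_reg_attr_idx
    ON tbl_reg (attr1, attr2, attr3, attr4, attr5, attr6);
ANALYZE tbl_reg;  -- refresh statistics so the planner sees the index
```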
--
Richard Huxton
Archonet Ltd