Re: Trouble with plan statistics for behaviour for query.

Trevor Campbell <tcampbell@xxxxxxxxxxxxx> · Fri, 01 Jun 2012 09:01:24 +1000



    On 01/06/12 08:55, Craig James wrote:
    

      On Thu, May 31, 2012 at 3:29 PM, Trevor
        Campbell <tcampbell@xxxxxxxxxxxxx>
        wrote:

        
           We are having trouble with a particular query being
              slow in a strange manner.

              
              The query is a join over two large tables that are
              suitably indexed.

              
              select CG.ID, CG.ISSUEID, CG.AUTHOR,
              CG.CREATED, CI.ID, CI.FIELDTYPE, CI.FIELD,
              CI.OLDVALUE, CI.OLDSTRING, CI.NEWVALUE, CI.NEWSTRING 

                 from PUBLIC.CHANGEGROUP CG inner join PUBLIC.CHANGEITEM
              CI on CG.ID = CI.GROUPID where
              CG.ISSUEID=? order by CG.CREATED asc, CI.ID asc

            
          This has an unbound variable '?' in it.

        
    These queries are being run from a java application using JDBC and
    when run the variable is bound to an long integer value.  While
    trying to investigate the problem, I have been just hard coding a
    value in the statement.

    
              For some tasks we run this particular query a very large
              number of times and it has a significant performance
              impact when it runs slowly.

              
              If we run ANALYSE over the CHANGEITEM table
              then the performance picks up by a factor of 5 or more. 
              The problem is that a day later the performance will have
              dropped back to its previously slow state.

              
              The reason this is so hard to understand is that the
              activity on this table is very low, with no updates and
              only a relatively small number of inserts each day, <
              0.1% of the table size.

              
              Explain output:

              Sort  (cost=86.90..86.93 rows=11 width=118) (actual
              time=0.086..0.087 rows=14 loops=1)

                Sort Key: cg.created, ci.id

                Sort Method: quicksort  Memory: 26kB

                ->  Nested Loop  (cost=0.00..86.71 rows=11 width=118)
              (actual time=0.022..0.061 rows=14 loops=1)

                      ->  Index Scan using chggroup_issue on
              changegroup cg  (cost=0.00..17.91 rows=8 width=33) (actual
              time=0.012..0.015 rows=7 loops=1)

                            Index Cond: (issueid = 81001::numeric)

                      ->  Index Scan using chgitem_chggrp on
              changeitem ci  (cost=0.00..8.58 rows=2 width=91) (actual
              time=0.005..0.005 rows=2 loops=7)

                            Index Cond: (groupid = cg.id)

              Total runtime: 0.116 ms

            
          What's the exact SQL you used to get this ... did you use a
          specific CG.ISSUEID to run your test?  If that's the case,
          this EXPLAIN ANALYZE won't be the same as the one generated
          for your actual application.

          
          Craig

          
              The explain output always seems the same even when the
              performance is poor, but I can't be sure of that.

              
              Overall it seems like PostgreSQL just forgets about the
              statistics it has gathered after a short while.

              
              Schema details:

              CREATE TABLE changegroup

              (

                id numeric(18,0) NOT NULL,

                issueid numeric(18,0),

                author character varying(255),

                created timestamp with time zone,

                CONSTRAINT pk_changegroup PRIMARY KEY (id )

              )

              WITH (

                OIDS=FALSE

              );

              CREATE INDEX chggroup_issue

                ON changegroup

                USING btree

                (issueid );

              
              CREATE TABLE changeitem

              (

                id numeric(18,0) NOT NULL,

                groupid numeric(18,0),

                fieldtype character varying(255),

                field character varying(255),

                oldvalue text,

                oldstring text,

                newvalue text,

                newstring text,

                CONSTRAINT pk_changeitem PRIMARY KEY (id )

              )

              WITH (

                OIDS=FALSE

              );

              
              CREATE INDEX chgitem_chggrp

                ON changeitem

                USING btree

                (groupid );

              
              CREATE INDEX chgitem_field

                ON changeitem

                USING btree

                (field COLLATE pg_catalog."default" );

              
              Table sizes

            changegroup  :  2,000,000

            rows

            changeitem    :  2,500,000  rows

            
            The changegroup table has on average about 4 rows per
            issueid value, which is the query parameter.

            
            We run autovacuum and autoanalyse, but as the activity in
            the table is low these are rarely if ever invoked on these
            tables.

            
            Environment.

            Testing using PostgreSQL 9.1.3 on x86_64-redhat-linux-gnu, 
            although this is a problem across a variety of postgres
            versions.