On 04/11/2019 10:13 PM, Jeff Janes wrote:
On 04/11/2019 08:09 PM, Jeff Janes wrote:
Since we dump & restore the production DB daily into the staging environment, the difference in size (as reported by psql's \l+) is 11GB in a freshly restored DB as opposed to 70GB in production.
Yeah, that seems like a problem. Do you have long-lived transactions/snapshots that are preventing vacuuming from removing dead tuples? You can run a manual "vacuum verbose" and see how many dead but nonremovable tuples there were, or set log_autovacuum_min_duration to some non-negative value less than the time the autovacuum takes, and do the same.
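(For reference, a rough way to check both of those: the first query below lists sessions whose open transactions/snapshots could be holding back cleanup, and the second turns on logging for autovacuum runs; the 250ms threshold is just an arbitrary illustrative value.)

# select pid, state, xact_start, backend_xmin
  from pg_stat_activity
  where backend_xmin is not null
  order by xact_start;

# alter system set log_autovacuum_min_duration = 250;
# select pg_reload_conf();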
Vacuum frees tuples just fine. It's just that by the time each run finishes, many more accumulate due to table update activity, ad nauseam. So this unused space constantly grows. Here's a sample autovacuum run:
2019-04-11 19:39:44.450841500 [] LOG: automatic vacuum of table "foo.public.bar": index scans: 1
2019-04-11 19:39:44.450843500 pages: 0 removed, 472095 remain, 4 skipped due to pins, 39075 skipped frozen
2019-04-11 19:39:44.450844500 tuples: 19150 removed, 2725811 remain, 465 are dead but not yet removable
2019-04-11 19:39:44.450845500 buffer usage: 62407557 hits, 6984769 misses, 116409 dirtied
2019-04-11 19:39:44.450846500 avg read rate: 16.263 MB/s, avg write rate: 0.271 MB/s
2019-04-11 19:39:44.450847500 system usage: CPU 59.05s/115.26u sec elapsed 3355.28 sec
This data doesn't seem to support either one of our theories. "Dead but not yet removable" is low. But "removed" also seems pretty low. Is 19,150 really the number of updates you think occur over the course of an hour, which causes the problem you are seeing? Updates that happened during one vacuum should be cleanly caught by the next one, so you should only see a steady state of bloat, not an unbounded increase.
But your buffer usage being 132 times the number of pages in the table suggests it is your indexes, not your table, which are bloated.
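(A quick check of that ratio from the log above: 62,407,557 buffer hits / 472,095 heap pages ≈ 132.)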
How many indexes do you have, and of what type? Index pages can only get reused when they become completely empty, or when a new indexed value fits into (or near) the key space that that page already covers. So if the key space for new tuples is constantly migrating around and your pages never become absolutely empty, you can get unbounded bloat in the indexes.
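(To see what's actually there, something along these lines lists each index on the table together with its access method and size; "bar" below just stands in for the bloated table from the log above.)

# select i.indexrelid::regclass as index_name,
         am.amname as index_type,
         pg_size_pretty(pg_relation_size(i.indexrelid)) as index_size
  from pg_index i
  join pg_class c on c.oid = i.indexrelid
  join pg_am am on am.oid = c.relam
  where i.indrelid = 'bar'::regclass
  order by pg_relation_size(i.indexrelid) desc;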
Way too many indexes. I'm going to have a hard time convincing our programmers to get rid of any of them :)
Can you compare the sizes object by object between the live and the stage, taking care not to include index (or toast) size into the size of their parent table?
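(One way to do that comparison on both databases is a query along these lines, which keeps heap and index sizes in separate columns per table.)

# select c.oid::regclass as table_name,
         pg_size_pretty(pg_relation_size(c.oid)) as heap_size,
         pg_size_pretty(pg_indexes_size(c.oid)) as index_size
  from pg_class c
  join pg_namespace n on n.oid = c.relnamespace
  where c.relkind = 'r' and n.nspname = 'public'
  order by pg_total_relation_size(c.oid) desc;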
You're right, it's mostly indexes that are bloated.
Staging:
# select pg_size_pretty(pg_relation_size('foo'));
pg_size_pretty
----------------
2924 MB
(1 row)
# select pg_size_pretty(pg_indexes_size('foo'));
pg_size_pretty
----------------
1958 MB
(1 row)
Prod:
# select pg_size_pretty(pg_relation_size('foo'));
pg_size_pretty
----------------
3688 MB
(1 row)
# select pg_size_pretty(pg_indexes_size('foo'));
pg_size_pretty
----------------
60 GB
(1 row)
Thanks, I'll try those. But as I said, the freshly restored DB is only 11GB in size, not 70 (only the public schema is used).
Yeah, but we need to know **why** that extra 59GB is not being reused, not simply the fact that it isn't being reused. If it isn't listed as free in the freespace map, then PostgreSQL might not know how to find it in order to reuse it, for example. But now that I think it is the indexes, not the table, that are bloated, I would chase that part down first. No point checking the freespace of the table proper if the problem is with the indexes.
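(If it helps, the pgstattuple extension can put numbers on the index side: pgstatindex() reports leaf density and fragmentation for a btree index. "some_index" below is just a placeholder for one of the suspect indexes.)

# create extension if not exists pgstattuple;
# select index_size, avg_leaf_density, leaf_fragmentation
  from pgstatindex('some_index');

A very low avg_leaf_density there would line up with the kind of index bloat described above.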
Cheers,
Jeff