Kurt Overberg wrote:
Richard,
Thanks for the feedback! I found oid2name and have been mucking about
with it, but haven't really found anything that stands out yet. Most of
the tables/indexes I'm comparing across machines seem to take up a
similar amount of disk space. I think I'm going to have to get fancy
and write some shell scripts. Regarding the slony configuration
scripts, you're assuming that I have such scripts. Our slony install
was originally installed by a contractor, and modified since then so
"getting my act together with respect to slony" is kinda beyond the
scope of what I'm trying to accomplish with this maintenance. I really
just want to figure out what's going on with db1, and want to do so in a
way that won't ruin slony since right now it runs pretty well, and I
doubt I'd be able to fix it if it seriously broke.
Upon a cursory pass with oid2name, it seems that my sl_log_1_idx1 index
is out of hand:
If the sl_log_1 table is large too, it'll be worth reading through the
FAQ to see if any of its notes apply.
http://cbbrowne.com/info/faq.html
-bash-3.00$ oid2name -d mydb -f 955960160
From database "mydb":
Filenode Table Name
--------------------------
955960160 sl_log_1_idx1
-bash-3.00$ ls -al 955960160*
-rw------- 1 postgres postgres 1073741824 Jun 19 11:08 955960160
-rw------- 1 postgres postgres 1073741824 Jun 13 2006 955960160.1
-rw------- 1 postgres postgres 909844480 Jun 19 10:47 955960160.10
-rw------- 1 postgres postgres 1073741824 Jul 31 2006 955960160.2
-rw------- 1 postgres postgres 1073741824 Sep 12 2006 955960160.3
-rw------- 1 postgres postgres 1073741824 Oct 19 2006 955960160.4
-rw------- 1 postgres postgres 1073741824 Nov 27 2006 955960160.5
-rw------- 1 postgres postgres 1073741824 Feb 3 12:57 955960160.6
-rw------- 1 postgres postgres 1073741824 Mar 2 11:57 955960160.7
-rw------- 1 postgres postgres 1073741824 Mar 29 09:46 955960160.8
-rw------- 1 postgres postgres 1073741824 Mar 29 09:46 955960160.9
I know that slony runs its vacuuming in the background, but it doesn't
seem to be cleaning this stuff up. Interestingly, in my pgfouine VACUUM
output that index doesn't take long at all to vacuum analyze (compared
to my other, much larger tables). Am I making the OID->filename
translation properly?
Looks OK to me.
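If you want to double-check it without oid2name, a query along these
lines (just standard pg_class columns, run in the same database) should
map the filenode back to its relation:

-- map a relfilenode back to the relation that owns it
SELECT relname, relkind, relpages
FROM pg_class
WHERE relfilenode = 955960160;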
Running this:
SELECT relname, relpages FROM pg_class ORDER BY relpages DESC;
...gives me...
sl_log_1_idx1 | 1421785
xrefmembergroup | 1023460
answerselectinstance | 565343
...does this jibe with what I'm seeing above? I guess I'll run a full
vacuum on the slony tables too? I figured something else would jump out
bigger than this. FWIW, the same table on db2 and db3 is very small,
like zero. I guess this is looking like it's overhead from slony?
Should I take this problem over to the slony group?
Well, pages are 8KB each (by default), so that'd be about 10.8GB, which
seems to match your file sizes above.
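If you're on 8.1 or later you can also let the server do that
arithmetic for you - roughly something like this, using the built-in
pg_relation_size and pg_size_pretty functions:

-- ten largest relations by on-disk size
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
ORDER BY pg_relation_size(oid) DESC
LIMIT 10;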
Read through the FAQ I linked to - for some reason Slony's not clearing
out transactions it's replicated to your slaves (they *are* in sync,
aren't they?). Could be a transaction preventing vacuuming, or perhaps a
partially dropped node?
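One rough way to spot a stuck transaction is to look for old backends
sitting "<IDLE> in transaction" (this assumes stats_command_string is
enabled so current_query is populated):

-- long-lived backends; an old '<IDLE> in transaction' holds back VACUUM
SELECT procpid, usename, current_query, query_start
FROM pg_stat_activity
ORDER BY query_start;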
Check the size of the sl_log_1 table and see if that tallies.
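Something like this should show the table and the index side by side -
note that the schema name comes from your Slony cluster name, so
'_mycluster' below is just a placeholder you'd need to replace:

-- replace _mycluster with your Slony cluster's schema name
SELECT c.relname, c.relpages,
       pg_size_pretty(pg_relation_size(c.oid)) AS size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = '_mycluster'
  AND c.relname IN ('sl_log_1', 'sl_log_1_idx1');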
--
Richard Huxton
Archonet Ltd