Re: Catching up Production from Warm Standby after maintenance - Please help

Hi -  Some answers, questions and comments below

> At my last job I had a head dev guy whose experience with pgsql was back in the 7.0 days or so, and his standard phrase was "vacuum in postgresql isn't fast enough."
Yeah, I don't want to turn into that guy. 
 
> So, what class machines are these, and specifically how much ram, what kind of RAID controllers, and how many hard drives are you throwing at the problem?
We have two identical enterprise Linux machines coming.  They are not here yet (hence my planning concerns).  Presently, we have a lesser but still good enterprise machine with 64 GB of RAM, 8 CPUs, and ~1 TB of space.  The new ones each have 5 TB of RAID disks, mirrored across to another 5 TB (20 5.4gb spinning disks).  I'm not sure of the RAID level, but I think it's RAID 6.  They each have 16 CPUs and (I believe) 128 GB of RAM, connected to our NFS network on a 10-gigabit switch (at least I think it's the 10-gigabit switch - it might be one gigabit).

> I wonder, what do you use to decide when to reindex?  Under this situation, it's very possible that you don't need to do it all that often.  Are you just flying 'willy-nilly' about reindexing things, or is there some indicator you use?
I am hoping to use system statistics for the table(s).  In theory, once the index size becomes a larger fraction of the table size than it reasonably should be, it's time to consider reindexing.  I thought to use some combination of pg_stat_user_indexes and pg_statio_all_indexes info, but I haven't had to do this yet, so if you have suggestions, that'd be helpful.  I should be able to cron a weekly check that emails me the results, with a warning if the numbers get too high.
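For the record, here's roughly the kind of weekly check I have in mind (an untested sketch; the 0.5 index-to-table size ratio is an arbitrary threshold I made up, not a recommendation):

```sql
-- Sketch: list indexes whose on-disk size is suspiciously large relative
-- to their table, along with how often they are actually scanned.
SELECT s.schemaname,
       s.relname       AS table_name,
       s.indexrelname  AS index_name,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
       pg_size_pretty(pg_relation_size(s.relid))      AS table_size,
       s.idx_scan
FROM   pg_stat_user_indexes s
WHERE  pg_relation_size(s.relid) > 0
  AND  pg_relation_size(s.indexrelid)::float8
       / pg_relation_size(s.relid) > 0.5   -- arbitrary "too big" cutoff
ORDER BY pg_relation_size(s.indexrelid) DESC;
```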
>So, you're allowing users to do adhoc DDL on a production system? No testing / staging environment beforehand?  You are asking for trouble
>IMHO. OTOH, there are instances where well defined applications can run DDL.

We have developed a big set of C APIs that work with Postgres and allow the users to create data tables based on a definition file.  They can specify an index or three, and can choose among a few data types.  The thing works remarkably well, much better than I would have thought at first blush.  The big "production" tables are designed carefully with DBA input.

>Define "maintenance". Maybe you're bringing along a few more bad habits from Sybase.
Oh, I am sure that's likely. Heh.  I lock out the users; reindex things that need reindexing, based on use patterns and size; update all histograms and statistics for user and system tables; and remove any dead rows possible (hopefully all of them).  Then I make a nice clean tape set before letting anyone back in.  I often cycle the server (pg_ctl stop/start) as well.  Sometimes we take that time to power-cycle the whole machine and do any machine maintenance we can, after the nice clean backup is done.
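In SQL terms, the pass looks roughly like this (a sketch; the table name is made up, and REINDEX would only hit the indexes flagged as bloated):

```sql
-- Illustrative maintenance pass, run with users locked out:
REINDEX TABLE big_history_table;  -- only tables whose indexes look bloated
VACUUM ANALYZE;                   -- clear out dead rows, refresh statistics
-- ...then take the tape backup before re-opening access.
```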
> XID wrap-around is still an issue, but if you have autovacuum on and you pay attention to your logs, you'll be okay.
Having re-read the section on XID wraparound, and with our manager's support in keeping people from running long transactions, I think we're okay here too.  I can always cron a job to comb through the logs (which I need to do anyway), and I can keep a better eye on LRTs.  Do you prefer parsing 'select * from pg_stat_activity;' to check for long-running transactions, or is there an easier way?
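Right now I'm picturing something along these lines (untested; these are the 8.3-era column names, and the one-hour cutoff is just a number I picked):

```sql
-- Sketch: flag transactions that have been open longer than an hour.
SELECT procpid,
       usename,
       now() - xact_start AS xact_age,
       current_query
FROM   pg_stat_activity
WHERE  xact_start IS NOT NULL
  AND  now() - xact_start > interval '1 hour'
ORDER BY xact_start;
```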

>You may want to look into constraint_exclusion partitioning
Thanks for the specifics on this tip.  I think I'll be able to use that since our data is largely time-based. 
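Just so I'm sure I have the shape of it right, my understanding is it looks something like this (table and column names invented for illustration):

```sql
-- Sketch of time-based partitioning via inheritance + CHECK constraints.
CREATE TABLE measurements (
    logged_at  timestamptz NOT NULL,
    reading    numeric
);

-- One child table per month, with a CHECK constraint bounding its range:
CREATE TABLE measurements_2009_08 (
    CHECK (logged_at >= '2009-08-01' AND logged_at < '2009-09-01')
) INHERITS (measurements);

-- With constraint_exclusion on, the planner skips children whose
-- CHECK constraints can't match the WHERE clause:
SET constraint_exclusion = on;
SELECT * FROM measurements WHERE logged_at >= '2009-08-15';
```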

> If you're running a version prior to 8.4, make sure your Free Space Manager settings are adequate. (A VACUUM VERBOSE at the database level will report where you're at in that regard in the last few lines.)
I show the following in postgresql.conf (this has not been tuned):
# - Free Space Map -
max_fsm_pages = 204800                  # min max_fsm_relations*16, 6 bytes each
                                        # (change requires restart)
#max_fsm_relations = 1000               # min 100, ~70 bytes each
                                        # (change requires restart)
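For reference, my understanding is that a database-wide VACUUM VERBOSE ends with a summary like the one sketched below, and if the "required" page slots exceed max_fsm_pages, the setting needs raising (wording and numbers here are from memory and invented for illustration, so may differ by version):

```sql
VACUUM VERBOSE;
-- INFO:  free space map contains 41000 pages in 812 relations
-- DETAIL:  A total of 204800 page slots are in use (including overhead).
-- 204800 page slots are required to track all free space.
-- Current limits are:  204800 page slots, 1000 relations, using 1265 kB.
```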

I have a couple of weeks to consider the possibility of upgrading to 8.4.  Since it's only been out for a short time, I doubt we'll upgrade before 8.4.2.

> We tried it a few times early on and discovered that aggressive maintenance is unnecessary if you do a good job with your regular maintenance, and that if you have room for a second copy of a table, CLUSTER is almost always a better option than VACUUM FULL.

Hm - Kevin, when you issue CLUSTER on your table, when do you switch the users over, and do you let them keep writing during the process?  Or is the second copy read-only to the users while the cluster-build is happening on the primary (making it unavailable)?  Thanks - and thanks for the Sybase difference explanation.

Thanks again everyone for your time and your help.  I really appreciate it.
-Jennifer


