Re: I don't want to back up index files

Glen Parker <glenebob@xxxxxxxxxx> · Thu, 12 Mar 2009 16:53:53 -0700

Tom Lane wrote:
Glen Parker <glenebob@xxxxxxxxxx> writes:
Mainly because the idea doesn't seem to make sense unless that's part
of the package.  If you don't cut index changes out of the WAL load
then the savings on the base backup alone aren't going to be all that
exciting when you consider the total cost of PITR backup.

In our setting, I think it might be more exciting than you think.  As I
said, I've not noticed any real impact on the system from WAL
exporting, but the nightly backup does indeed have a significant impact
because of how long it runs.  WAL export takes a couple of seconds every
few minutes, which nobody ever notices.  The backup runs for a minimum of
an hour and fifteen minutes, which people definitely notice.

Furthermore, you would need some very ugly hacks on the recovery process
to make it ignore (rather than try to apply) WAL records relating to
indexes.  I believe there are a fair number of cases where the recovery
process doesn't even know that a particular file is an index, because
the WAL stream doesn't tell it.  The live backends generating the WAL
log entries typically know that (and could suppress the entries) but the
recovery process has only a very limited view of reality.  It cannot,
for example, trust the system catalogs to be in a correct/consistent
state, so it couldn't look up the info for itself.

Could the live backends label the log entries with "hints" to be used by 
the replay process?  In this case, I would think a simple flag 
indicating whether replay is critical or not would suffice.
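
To make the "hint" idea concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: WalRecord, replay_critical and replay() are illustration-only names, not PostgreSQL's real WAL format or recovery code. The assumption is that the backend writing a record knows whether it touches an index and sets the flag, so replay never has to consult the catalogs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical, simplified stand-in for a WAL record."""
    lsn: int
    payload: str
    # Assumed hint written by the live backend: False would mean
    # "only needed to maintain an index that can be rebuilt later".
    replay_critical: bool = True


def replay(records: Iterable[WalRecord],
           apply: Callable[[WalRecord], None],
           skip_noncritical: bool = False) -> List[int]:
    """Apply records in order; return the LSNs that were skipped.

    With skip_noncritical=False this behaves like an ordinary replay
    loop, so nothing changes for people who don't opt in.
    """
    skipped: List[int] = []
    for rec in records:
        if skip_noncritical and not rec.replay_critical:
            skipped.append(rec.lsn)  # remember what must be rebuilt
            continue
        apply(rec)
    return skipped
```

The list of skipped LSNs is only there to show that replay would still need to remember which indexes were left stale, so they could be rebuilt afterwards.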

BTW, there's a related problem with the idea, which is that the
tools normally used to take base backups haven't got any way to
distinguish indexes from any other kind of relation.

Yes, there's no doubt it would increase the complexity of the base
backup, IF a person chooses to ignore indexes.  The upside is that
people who are happy with the backup as it is would have to do nothing
at all; it would just continue to work as it does now.  To ignore
indexes (and only certain indexes at that), you'd have to examine the
system catalog as part of each backup.  I already do that to some
extent, in order to discover all the extra tablespaces that need to be
backed up.
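
Roughly, the catalog step could look like the sketch below. It assumes the backup script has already run something like "SELECT relfilenode, relkind FROM pg_class" in each database and has the rows in hand; index_filenodes() and should_back_up() are made-up helper names, and the file-name pattern (an optional "_fork" suffix and an optional ".N" segment suffix) is my assumption about how relation files are named, so check it against your server version.

```python
import re
from typing import Iterable, Set, Tuple

# Assumed shape of relation file names: <filenode>[_fork][.segment]
_RELATION_FILE = re.compile(r"^(\d+)(?:_[a-z]+)?(?:\.\d+)?$")


def index_filenodes(pg_class_rows: Iterable[Tuple[int, str]]) -> Set[int]:
    """Collect filenodes of indexes from (relfilenode, relkind) rows.

    relkind 'i' marks an index in pg_class.  Rows with relfilenode 0
    are ignored because they give no usable file name.  To skip only
    certain indexes, filter the rows before passing them in.
    """
    return {node for node, kind in pg_class_rows if kind == "i" and node != 0}


def should_back_up(filename: str, skip: Set[int]) -> bool:
    """Return False only for files belonging to a skipped filenode.

    Filenodes are only meaningful within one database's directory, so
    call this per database with that database's own skip set.
    """
    match = _RELATION_FILE.match(filename)
    if match is None:
        return True  # not a relation file (e.g. PG_VERSION): always keep
    return int(match.group(1)) not in skip
```

A file-level backup tool would call should_back_up() for each file name in a database directory. Note that this only covers the base backup side; it does nothing about the WAL replay problem discussed above.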

I guess the biggest problem I see with this is that it would have rather 
a small target audience.

-Glen

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)