Tom Lane wrote:
Glen Parker <glenebob@xxxxxxxxxx> writes: Mainly because the idea doesn't seem to make sense unless that's part of the package. If you don't cut index changes out of the WAL load then the savings on the base backup alone aren't going to be all that exciting when you consider the total cost of PITR backup.
In our setting, I think it might be more exciting than you think. As I said, I've not noticed any real impact on the system related to WAL exporting, but the nightly backup does indeed have a significant impact because of how long it runs. WAL export is a couple of seconds every few minutes, which nobody ever notices. The backup runs for a minimum of an hour and fifteen minutes, which people definitely notice.
Furthermore, you would need some very ugly hacks on the recovery process to make it ignore (rather than try to apply) WAL records relating to indexes. I believe there are a fair number of cases where the recovery process doesn't even know that a particular file is an index, because the WAL stream doesn't tell it. The live backends generating the WAL log entries typically know that (and could suppress the entries) but the recovery process has only a very limited view of reality. It cannot, for example, trust the system catalogs to be in a correct/consistent state, so it couldn't look up the info for itself.
Could the live backends label the log entries with "hints" to be used by the replay process? In this case, I would think a simple flag indicating whether replay is critical or not would suffice.
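To make the question concrete, here is a toy sketch in Python of what such a hint could look like. It is not PostgreSQL's WAL format or recovery code; WalRecord, replay_critical and replay() are all hypothetical names made up for illustration. The only point is that the writer sets the flag, so the replay side never has to consult the catalogs.

```python
from dataclasses import dataclass

@dataclass
class WalRecord:
    """Toy stand-in for a WAL record; not PostgreSQL's real layout."""
    lsn: int
    payload: str
    # Hypothetical hint set by the backend that wrote the record.
    # False would mean "safe to skip", e.g. a change to a rebuildable index.
    replay_critical: bool = True

def replay(records, apply, skip_noncritical=False):
    """Apply records in order; return the LSNs that were skipped."""
    skipped = []
    for rec in records:
        if skip_noncritical and not rec.replay_critical:
            # Decided from the flag alone, no catalog lookup needed.
            skipped.append(rec.lsn)
            continue
        apply(rec)
    return skipped
```

Any index skipped this way would of course have to be rebuilt after recovery finishes.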
BTW, there's a related problem with the idea, which is that the tools normally used to take base backups haven't got any way to distinguish indexes from any other kind of relation.
Yes, there's no doubt it would increase the complexity of the base backup, IF a person chooses to ignore indexes. The upside is that people who are happy with the backup as it is would have to do nothing at all; it would just continue to work as it does now. To ignore indexes (and only certain indexes at that), you'd have to examine the system catalog as part of each backup. I already do that to some extent, in order to discover all the extra tablespaces that need to be backed up.
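A rough sketch of that catalog step, in Python, might look like the following. It assumes the backup script has already run something like INDEX_FILES_SQL in each database and collected the resulting paths; the query, the helper names and the example paths are illustrative assumptions, not part of any existing backup tool. pg_class.relkind = 'i' identifies indexes, and pg_relation_filepath() is only available in PostgreSQL 9.0 and later.

```python
import re

# Hypothetical query: on-disk path of every index in the current database.
# Assumes a server new enough to have pg_relation_filepath() (9.0 or later).
INDEX_FILES_SQL = """
SELECT pg_relation_filepath(c.oid)
FROM pg_class c
WHERE c.relkind = 'i'
"""

# Matches a trailing fork suffix (_fsm, _vm, _init) and/or a trailing
# segment suffix (.1, .2, ...) on a relation file name.
_SUFFIX = re.compile(r"(_(fsm|vm|init))?(\.\d+)?$")

def base_relation_path(path):
    """Map 'base/16384/16390_fsm.1' back to 'base/16384/16390'."""
    return _SUFFIX.sub("", path, count=1)

def files_to_back_up(all_files, index_paths):
    """Return all_files minus every file that belongs to a listed index."""
    skip = set(index_paths)
    return [f for f in all_files if base_relation_path(f) not in skip]
```

Filtering the file list is the easy half. A backup taken this way would still need the recovery-side changes discussed above, plus a rebuild of the omitted indexes, before it could be used.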
I guess the biggest problem I see with this is that it would have rather a small target audience.
-Glen