Re: unlogged tables

Jim Nasby <Jim.Nasby@xxxxxxxxxxxxxx> · Mon, 13 Apr 2015 21:45:21 -0500

On 4/13/15 7:32 PM, David G. Johnston wrote:
        The missing feature is an option to leaved restored the last
        checkpoint.  Instead, not knowing whether there were changes
        since the last checkpoint, the system truncated the relation.

        What use case is there for a behavior that the last checkpoint
        data is left on the relation upon restarting - not knowing
        whether it was possible the other data could have been written
        subsequent?

    I would like a way to have unlogged tables be available on a replica
    provided that no changes were made to them between the pg_basebackup
    and the recovery point.

    My use case is that I mark certain read-only-after-bulk-loading
    tables as unlogged solely to avoid blowing out the log archive
    during the loading phase and refresh phase.  This is stuff like
    vendor catalogs, NCBI datasets, ChEMBL datasets, etc, which can
    simply be re-derived from the reference.  It would be nice if these
    were still available (without having to repeat the ETL) after
    crashes provided they were not written to since a checkpoint, and
    available on cloned test servers without having to repeat the ETL on
    those as well.

My gut reaction is that those should be in their own clusters and
accessed via postgres_fdw...

Likely to produce really crappy plans if the tables are of any real size...

That particular use-case would probably best be served with a separate
replication channel which pushes data files from the primary to the
slaves and allows for the slave to basically "rewrite" its existing
table by pointing to the newly supplied version.  Some kind of "CREATE
STATIC TABLE" and "PUSH STATIC TABLE TO {all | replica name}" command
combo...though ideally with less manual intervention...

You still have the same problem of knowing if someone has scribbled on 
the data since the last checkpoint.

There's been recent discussion of adding support for read-only tables. 
If we had those, we might be able to support something like...

INSERT INTO unlogged;
ALTER TABLE unlogged SET READ ONLY;
CHECKPOINT;
/* take backup */

This should be safe as long as we WAL log changes to read-only status 
(which presumably we would).

How much work that would entail though, I don't know.

Ultimately you still have to get the data over to the other machine 
anyway. ISTM it'd be a LOT more useful to look at ways to make the WAL 
logging of bulk inserts (and especially COPY into a known empty table) a 
lot more efficient.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance