Search Postgresql Archives

Re: pg_restore enhancements

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks, the issue we've run into, which I guess could be really a setup issue, with running a COPY command while executing pg_restore, 
is that if we are restoring a large table (bigger than 500GB) our WAL directory can grow to be very large.

I would think that if the pg_restore or COPY command was able to support a batch-size option, 
this should allow postgres to either archive or remove wal files and prevent having to re-size the WAL directory for a one time refresh operation.

I'm trying to gage how feasible would be to start looking at contributing to add such a feature to either the COPY command or pg_restore.

Thanks,
Efrain J. Berdecia


On Wednesday, November 22, 2023 at 11:37:13 AM EST, Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote:


On 11/22/23 05:25, Efrain J. Berdecia wrote:
> After working for a site where we are constantly doing logical pg_dump
> to refresh environments I've come to miss features available in other
> RDBMS' refresh/restore utilities.
>
> Someone could point me in the right direction otherwise, but pg_restore
> seems to be lacking the ability to resume a restore upon failure, is all
> or nothing with this guy. There also doesn't seem to be a way to control
> batch size when doing the COPY phase, therefore preventing the WAL
> directory from filling up and crashing the system.

The above needs more information on Postgres version(community or fork),
OS and version, the size of the data set, the storage type and size, the
Postgres conf, etc. Restores are being done all the time and this is the
first report, as far as I can remember,  about an issue with COPY and
WAL in a restore.


pg_restore
https://www.postgresql.org/docs/current/app-pgrestore.html

Does have:

--section=sectionname

    Only restore the named section. The section name can be pre-data,
data, or post-data. This option can be specified more than once to
select multiple sections. The default is to restore all sections.

    The data section contains actual table data as well as large-object
definitions. Post-data items consist of definitions of indexes,
triggers, rules and constraints other than validated check constraints.
Pre-data items consist of all other data definition items.


AND

-l
--list

    List the table of contents of the archive. The output of this
operation can be used as input to the -L option. Note that if filtering
switches such as -n or -t are used with -l, they will restrict the items
listed.


-L list-file
--use-list=list-file

    Restore only those archive elements that are listed in list-file,
and restore them in the order they appear in the file. Note that if
filtering switches such as -n or -t are used with -L, they will further
restrict the items restored.

    list-file is normally created by editing the output of a previous
-l operation. Lines can be moved or removed, and can also be commented
out by placing a semicolon (;) at the start of the line. See below for
examples.


>
> IMHO, it would be nice to have a feature that would allow pg_restore to
> resume based on which part of the restore and/or object have already
> been restored.
>
> When it comes to the COPY phase of the restore, it would be nice to be
> able to control batch size and resume COPY of a particular object upon
> failure.

COPY as it stands now is all or none, so that command would have to be
changed.


>
> Thanks in advance for any suggestions or the green light to post this to
> the PG-developer group :-)
>
> Thanks,
> Efrain J. Berdecia


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux