Re: Enhance pg_dump multi-threaded streaming (WAS: Re: filesystem full during vacuum - space recovery issues)

Doug Reynolds <mav@xxxxxxxxxxxxx> · Fri, 19 Jul 2024 21:21:50 +0000

Thomas—

Why are you using logical backups for a database this large?  A solution like PgBackRest?  Obviously, if you are going to upgrade, but for operational use, that seems to be a slow choice.

Doug

On Jul 19, 2024, at 4:26 PM, Thomas Simpson <ts@xxxxxxxxxxxxxx> wrote:

    Hi Scott

    On 19-Jul-2024 15:34, Scott Ribe wrote:

        On Jul 19, 2024, at 7:46 AM, Thomas Simpson <ts@xxxxxxxxxxxxxx> wrote:

I realize some of the background was snipped on what I sent to the hacker list, I'll try to fill in the details.

      I was gone from my computer for a day and lost track of the thread.

Perhaps logical replication could help you out here?

    I'm not sure - perhaps, but at this point, I've got that
      dump/reload running and provided it completes ok (in about 20 days
      time at current rate), I'll be fine with this.
    The database itself is essentially an archive of data so is no
      longer being added to at this point, so it's an annoyance for the
      rebuild time rather than a disaster.
    [But incidentally, I am working on an even larger project which
      is likely to make this one seem small, so improvement around large
      databases is important to me.]

    However, my thought is around how to avoid this issue in the
      future and to improve the experience for others faced with the
      dump-reload which is always the fall-back upgrade suggestion
      between versions.
    Getting parallelism should be possible and the current pg_dump
      does that for directory mode from what I can see - making multiple
      threads etc.  according to parallel.c in pg_dump, it even looks
      like most of where my thought process was going is actually
      already there.
    The extension should be adding synchronization/checkpointing
      between the generating dump and the receiving reload to ensure
      objects are not processed until all their requirements are already
      present in the new database.  This is all based around routing via
      network streams instead of the filesystem as currently happens.
    Perhaps this is already in place since the restore can be done in
      parallel, so must need to implement that ordering already?  If
      someone with a good understanding of dump is able to comment or
      even give suggestions, I'm not against making an attempt to
      implement something as a first attempt.
    I see Tom Lane from git blame did a bunch of work around the
      parallel dump back in 2020 - perhaps he could make suggestions
      either via private direct email or the list ?

    Thanks
    Tom