On 2024-02-04 02:14:20 +0530, Lok P wrote:
> However, as we have ~5 billion rows in the base table and out of
> that, we were expecting almost half i.e. ~2 billion would be
> duplicates.

That's way more than I expected from your original description. And it
of course raises the question whether it's a good idea to just throw
away all that data or if you need to keep it in a normalized way.

> And you said, doing the inserts using the "order by CTID Offset"
> approach must cause one full sequential scan of the whole table for
> loading each chunk/10M of rows and that would take a long time I
> believe.
>
> I am still trying to understand the other approach which you
> suggested. Not able to understand "you can select where index_col >
> last order by index_col limit 10M,".
> However, to get the max ID value of the last 10M loaded rows in
> target, do you say that having a PK index created on that target
> table column (ID) will help,

Yes. Getting the maximum value from an index is a very fast operation.
You just have to traverse down the right edge of the tree (or you may
even be able to access the right-most leaf page directly).

> and we can save the max(ID) value subsequently in another table to
> fetch and keep loading from the source table (as ID > Max_ID stored
> in temp table)?

Another table or a variable in a script. Personally, if I need to do
something repeatedly, I usually write a script in the scripting
language I feel most comfortable in (which has been Python for the
last 7 or 8 years, Perl before that), which gives you variables,
loops, conditionals and - above all - repeatability.

> OR
> Would it be better to do it in one shot only, but by setting a higher
> value of some parameters like "maintenance_work_mem" or
> "max_parallel_workers"?

Hard to say. Normally, processing in fewer, bigger chunks is faster.
But RAM is much faster than disk (even with SSDs), so it might be
faster to make work_mem as large as you can and then use a chunk size
which just fits inside work_mem. Of course finding that sweet spot
takes experimentation, hence time, and it may make little sense to
experiment for 20 hours just to save 40 minutes.
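To make that concrete, here is a rough sketch of such a script in
Python with psycopg2. The connection string and the names
(source_table, target_table, id) are placeholders, it assumes id is
unique in the source and is the primary key of the target, and it
leaves out whatever logic you use to actually drop the duplicates: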
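    import psycopg2

    CHUNK_SIZE = 10_000_000        # 10M rows per chunk, as in the thread

    # Placeholder connection string - adjust for your environment.
    conn = psycopg2.connect("dbname=mydb")

    with conn.cursor() as cur:
        # Resume where a previous run left off. max() on an indexed
        # column only has to walk down the right edge of the btree, so
        # this is fast even on a billion-row table.
        cur.execute("SELECT coalesce(max(id), 0) FROM target_table")
        last_id = cur.fetchone()[0]

    while True:
        with conn.cursor() as cur:
            # Keyset pagination: WHERE id > %s ... ORDER BY id LIMIT %s
            # starts an index range scan just after last_id instead of
            # re-reading the whole table the way ORDER BY ctid OFFSET
            # would. Assumes id is unique in the source; with duplicate
            # ids, rows sharing the boundary value could be skipped.
            cur.execute("""
                INSERT INTO target_table
                SELECT *
                FROM source_table
                WHERE id > %s
                ORDER BY id
                LIMIT %s
                """, (last_id, CHUNK_SIZE))
            inserted = cur.rowcount
            # Fast max() via the PK index on the target, as above.
            cur.execute("SELECT max(id) FROM target_table")
            last_id = cur.fetchone()[0]
        conn.commit()              # one transaction per chunk
        print(f"loaded {inserted} rows, up to id {last_id}")
        if inserted < CHUNK_SIZE:  # short chunk: we've reached the end
            break

    conn.close()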
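Because each chunk commits separately, the script is restartable: if
it dies, you can just run it again and it continues from the current
max(id) in the target. Your dedup logic would go into the
INSERT ... SELECT - for example, one way to drop duplicates across
chunk boundaries is ON CONFLICT DO NOTHING against a unique index on
the target - and CHUNK_SIZE is the knob to experiment with against
work_mem as discussed above.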
        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@xxxxxx         | -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |    challenge!"