Re: Sudden insert performance degradation

Sebastian Dressler <sebastian@xxxxxxxxxxx> · Wed, 15 Jul 2020 08:03:46 +0000

Hi Henrique,

On 15. Jul 2020, at 03:13, Henrique Montenegro <typoon@xxxxxxxxx> wrote:

[...]

```

ssl = off

shared_buffers = 160GB                  # min 128kB

work_mem = 96GB                         # min 64kB

maintenance_work_mem = 12GB             # min 1MB

max_stack_depth = 4MB                   # min 100kB

dynamic_shared_memory_type = posix      # the default is the first option

synchronous_commit = off                # synchronization level;

commit_delay = 100000                   # range 0-100000, in microseconds

max_wal_size = 3GB

min_wal_size = 1GB

min_parallel_index_scan_size = 64kB

effective_cache_size = 96GB

log_min_messages = debug1 # values in order of decreasing detail:

log_checkpoints = on

log_error_verbosity = verbose # terse, default, or verbose messages

log_line_prefix = '%m [%p] %q%u@%d '            # special values:

log_lock_waits = on                     # log lock waits >= deadlock_timeout

log_timezone = 'America/New_York'

log_executor_stats = on

datestyle = 'iso, mdy'

```

[...]

  Limit  |  Time (seconds)

---------|------------------

10       | 0.6

100      | 0.6

1000     | 1.3

10000    | 116.9

100000   | 134.8

1000000  | 193.2

Notice the jump in time execution from a 1k limit to a 10k limit. Amount of

data raised 10x and execution time raised 100x.

It seems to me that inserting the data in this case is slow because the time

it takes to identify the duplicate records (which I assume would be done in a

fashion similiar to the queries above) is taking a long time.

I have attached the `explain analyze` output for the 1k and 10k queries to

this email (they are 4k+ lines each, didn't want to make this messager bigger

than it already is).

* exp1k.txt

* exp10k.txt

[...]

I quickly glanced at the exp10k plan and there are some things I noticed (sorry for not going over all the mail, have to re-read it again):

- There are a lot of partitions now, you maybe want consider reducing the amount. To me it seems that you overload the system. Scan times are low but the overhead to start a scan is likely quite high.
- work_mem = 96GB seems very high to me, I guess you'd be better with e.g. 4GB as a start but many more parallel workers. For instance, depending on your machine, try adjusting the max_worker_processes, max_parallel_workers and max_parallel_workers_per_gather.
 Values depend a bit on your system, make sure, that max_parallel_workers_per_gather are much lower than max_parallel_workers and that must be lower than max_worker_processes. You can try large values, for instance 128, 120, 12.
- You may want to test with min_parallel_table_scan_size = 0
- Did you enable partition pruning, partitionwise join and aggregate?

Thanks,
Sebastian

--

Sebastian Dressler, Solution Architect 

+49 30 994 0496 72 | sebastian@xxxxxxxxxxx 

Swarm64 AS

Parkveien 41 B | 0258 Oslo | Norway

Registered at Brønnøysundregistrene in Norway under Org.-Number 911 662 787

CEO/Geschäftsführer (Daglig Leder): Thomas Richter; Chairman/Vorsitzender (Styrets Leder): Dr. Sverre Munck 

Swarm64 AS Zweigstelle Hive

Ullsteinstr. 120 | 12109 Berlin | Germany

Registered at Amtsgericht Charlottenburg - HRB 154382 B