Re: OOM killer while pg_restore

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Em qui., 3 de mar. de 2022 às 15:32, Marc Rechté <marc4@xxxxxxxxx> escreveu:
Le 03/03/2022 à 16:31, Tom Lane a écrit :
> =?UTF-8?Q?Marc_Recht=c3=a9?= <marc4@xxxxxxxxx> writes:
>> We have a pg_restore which fails due to RAM over-consumption of the
>> corresponding PG backend, which ends-up with OOM killer.
>> The table has one PK, one index, and 3 FK constraints, active while
>> restoring.
>> The dump contains over 200M rows for that table and is in custom format,
>> which corresponds to 37 GB of total relation size in the original DB.
> The FKs would result in queueing row trigger events, which would occupy
> some memory.  But those should only need ~12 bytes per FK per row,
> which works out to less than 10GB for this number of rows, so it may
> be that you've hit something else that we would consider a leak.
>
> Does memory consumption hold steady if you drop the FK constraints?
>
> If not, as others have noted, we'd need more info to investigate
> this.  The leak is probably independent of the specific data in
> the table, so maybe you could make a small self-contained example
> using a script to generate dummy data.
>
>                       regards, tom lane
>
>
Actually the number of rows is 232735712.

Accordingly the RAM consumption would be x12 x3 = 7.8 GiB.

This is close to the 8,1g I reported earlier (actually it was closer to
7.8 GB, due to GiB vs. GB confusion).

So there is no memory leak.

It took 16 hours on my box to reach that RAM consumption, and then the
COPY failed when checking the first FK (as the referenced table was empty).

I dropped the FK, index, and 3 FK constraints and started over the
pg_restore:

11 minutes to load the table (I did not have time to note RAM consumption)

I then created the PK and index:

24 minutes

For FK, I don't know because the referenced table are empty (but I'll be
able to test next week, if deemed necessary).

16 hours vs. 35 minutes to reach the same state.
Maybe it's out of reach, but one way to help Postgres developers fix this
is to provide Flame Graphs [1] based on these slow operations.
For confidentiality and privacy reasons, the data is out of reach.

My 2c here.

regards,
Ranier Vilela
[1] https://www.brendangregg.com/flamegraphs.html

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux