Re: Fwd: temp_file_limit?

Ok, just to make sure that I understand correctly:
The parallel hash implementation needs to resize its table because the actual tuple count does not match the planner's estimate. I do expect this to be the case: Postgres often grossly underestimates row counts in our queries.
This is not fully implemented yet: removing the "old" files is not done yet, so every time the table resizes it creates a new set of files and the old ones remain.
I assume that the "used file size" only includes the "current" set of files, and that the old ones are not counted towards that amount? That would explain why it overallocates, of course.
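
For the record, a quick way to make those repartitioning generations visible is the log_temp_files setting Thomas mentions below. Just a rough sketch (log_temp_files is a superuser-only setting, so I'd set it cluster-wide):

    -- Log every temporary file, regardless of size, when it is removed;
    -- the ...of128..., ...of256... generation names then show up in the server log.
    ALTER SYSTEM SET log_temp_files = 0;
    SELECT pg_reload_conf();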

In any case, I now know what to do: I just need to disable all parallelism ( •̀ᴗ•́ )و ̑̑

I usually do that anyway because parallelism makes our queries die randomly. This is just another reason.

I restarted that query with max_parallel_workers_per_gather=0, and this does not seem to use temp space at all. It was not exactly fast: it took 82 minutes, with a single process running at 100% CPU. https://explain.depesz.com/s/HedE
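
For anyone following along, this is roughly what I mean; a minimal session-level sketch, assuming the setting is applied with SET rather than in postgresql.conf:

    -- Disable parallel plans for this session only, then run the query.
    SET max_parallel_workers_per_gather = 0;
    -- ... run the problem query here ...
    RESET max_parallel_workers_per_gather;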

Thanks a lot for your help, Justin, and to all the others who chimed in too.

The next round is to try to get that query to reliably run in the 5 seconds that we know it is capable of.
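
To compare the slow and fast plans I'll capture something along these lines; just a sketch, with the actual query elided:

    -- Capture the full plan with run time and buffer usage for comparison.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT ...;  -- the problem query goes here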


On Mon, Dec 19, 2022 at 6:46 PM Justin Pryzby <pryzby@xxxxxxxxxxxxx> wrote:
On Mon, Dec 19, 2022 at 06:27:57PM +0100, Frits Jalvingh wrote:
> I have listed the files during that run,

> 213M -rw------- 1 postgres postgres 213M dec 19 17:46 i100of128.p0.0
> 207M -rw------- 1 postgres postgres 207M dec 19 17:46 i100of128.p1.0
> 210M -rw------- 1 postgres postgres 210M dec 19 17:49 i100of256.p0.0
> 211M -rw------- 1 postgres postgres 211M dec 19 17:49 i100of256.p1.0
> 188M -rw------- 1 postgres postgres 188M dec 19 17:53 i100of512.p0.0
[...]

I think that proves Thomas' theory.  I'm not sure how that helps you,
though...

On Mon, Dec 19, 2022 at 01:51:33PM +1300, Thomas Munro wrote:
> One possibility is that you've hit a case that needs several rounds of
> repartitioning (because of a failure to estimate the number of tuples
> well), but we can't see that because you didn't show EXPLAIN (ANALYZE)
> output (understandably if it runs out of disk space before
> completing...).  The parallel hash code doesn't free up the previous
> generations' temporary files; it really only needs two generations'
> worth concurrently (the one it's reading from and the one it's writing
> to).  In rare cases where more generations are needed it could unlink
> the older ones -- that hasn't been implemented yet.  If you set
> log_temp_files = 0 to log temporary file names, it should be clear if
> it's going through multiple rounds of repartitioning, from the names
> (...of32..., ...of64..., ...of128..., ...of256..., ...).

--
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581
