Ogden Brash <info@xxxxxxxxxx> writes:

> I have a question about the files in .../data/postgresql/11/main/base,
> specifically in relation to very large tables and how they are
> written.
>
> I have been attempting to restore a relatively large database with
> pg_restore and it has been running for more than a week. (I also have
> another thread about the same restore related to network vs. local
> disk I/O.)
>
> I ran the pg_restore in verbose mode using multiple jobs, so I can
> tell what has finished and what has not. The database had 66 tables
> and most of them have been restored. Two of the tables were quite
> large (billions of rows, translating to between 1 and 2 TB of data on
> disk for those two tables), and those two tables are pretty much the
> only things remaining that have not been reported as finished by
> pg_restore.
>
> As the process has been going for a week, I have been tracking the
> machine (a dedicated piece of hardware, non-virtualized) and have
> been noticing a progressive slowdown (as tracked by iostat). There is

Do the tables that are being loaded have any indexes on them?

> nothing running on the machine besides postgresql, and the server is
> only doing this restore, nothing else. It is now, on average, running
> at less than 25% of the speed it was running at four days ago (as
> measured by rate of I/O).
>
> I started to dig into what was happening on the machine and I noticed
> the following:
>
> iotop reports that two postgres processes (I assume each processing
> one of the two remaining tables) are doing all the I/O. That makes
> sense.
>
> Total DISK READ :    1473.81 K/s | Total DISK WRITE :     617.30 K/s
> Actual DISK READ:    1473.81 K/s | Actual DISK WRITE:       0.00 B/s
>   TID  PRIO  USER      DISK READ   DISK WRITE  SWAPIN      IO>  COMMAND
>  6601  be/4  postgres  586.44 K/s    7.72 K/s  0.00 %  97.39 %  postgres: 11/main: postg~s thebruteff [local] COPY
>  6600  be/4  postgres  887.37 K/s  601.87 K/s  0.00 %  93.42 %  postgres: 11/main: postg~s thebruteff [local] COPY
>   666  be/3  root        0.00 B/s    7.72 K/s  0.00 %   5.73 %  [jbd2/sda1-8]
>     1  be/4  root        0.00 B/s    0.00 B/s  0.00 %   0.00 %  init
>     2  be/4  root        0.00 B/s    0.00 B/s  0.00 %   0.00 %  [kthreadd]
>     4  be/0  root        0.00 B/s    0.00 B/s  0.00 %   0.00 %  [kworker/0:0H]
>
> So, the next thing I thought I would do was run an "lsof" command for
> each of those two processes to see what they were writing. That was a
> bit of a surprise:
>
> # lsof -p 6600 | wc -l;
> 840
>
> # lsof -p 6601 | wc -l;
> 906
>
> Is that normal? That there are so many open file pointers - ~900 open
> file pointers for each of the processes?
>
> The next thing I did was look at the actual data files, to see how
> many there are. In my case they are in postgresql/11/main/base/24576,
> and there are 2076 files there. That made sense.
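For what it's worth, ~900 descriptors is not necessarily alarming by
itself: Postgres stores every relation as a series of 1 GB segment
files, so a 1-2 TB table alone accounts for well over a thousand
segments, and each backend keeps a cache of the descriptors it has
used, capped by max_files_per_process (default 1000). You can check
both the segment math and my index question from psql. A minimal
sketch - 'big_table' here is a hypothetical stand-in for one of your
two large tables:

    -- Which file(s) on disk belong to the table?
    SELECT pg_relation_filepath('big_table');

    -- Heap size; divide by 1 GB for the expected number of segments.
    SELECT pg_size_pretty(pg_relation_size('big_table'));

    -- Any indexes that must be maintained row by row during the COPY?
    SELECT indexrelid::regclass
    FROM   pg_index
    WHERE  indrelid = 'big_table'::regclass;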
> However, I found that when I list them by modification date I see
> something interesting:
>
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.7
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.8
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.9
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.10
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.11
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.12
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.13
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.14
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.16
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.15
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.17
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.18
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.19
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.21
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.22
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.23
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.24
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.25
> -rw------- 1 postgres postgres 1073741824 Oct  8 13:05 27083.26
> -rw------- 1 postgres postgres   19062784 Oct  8 13:05 27082_fsm
> -rw------- 1 postgres postgres  544489472 Oct  8 13:05 27077.34
> -rw------- 1 postgres postgres  169705472 Oct  8 13:05 27082.72
> -rw------- 1 postgres postgres  978321408 Oct  8 13:05 27083.27
> -rw------- 1 postgres postgres  342925312 Oct  8 13:05 27076.88
>
> If you notice, the file size is capped at 1 GB, and as the giant
> table has grown it has added more files in this directory. However,
> the mysterious thing to me is that it keeps modifying those files
> constantly - even the ones that are completely full. So for the two
> large tables it has been restoring all week, the modification time
> for the ever-growing list of files is being updated constantly.
>
> Could it be that that's why I am seeing a slowdown over the course of
> the week - that as the number of files for the table has grown, the
> system is spending more and more time seeking around the disk to
> touch all those files?
>
> Does anyone who understands the details of postgresql's interaction
> with the file system have an explanation for why all those files
> which are full are being touched constantly?

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consulting@xxxxxxxxxxx
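P.S. Regarding the already-full segments whose modification times keep
changing: one thing worth ruling out is that some of those filenodes
belong to indexes rather than to the table heaps themselves. A COPY
into an indexed table dirties b-tree pages scattered across the whole
index, not just at its tail, which would explain both the constant
touching of "full" segments and the progressive slowdown as the
indexes outgrow RAM. A sketch for mapping a filenode from your
directory listing back to its relation (27083 taken from the listing
above):

    -- The first argument is the tablespace; 0 means the database's
    -- default tablespace.
    SELECT pg_filenode_relation(0, 27083);

    -- Equivalently, via pg_class (relkind: r = table, i = index):
    SELECT relname, relkind
    FROM   pg_class
    WHERE  relfilenode = 27083;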