Better understanding checkpoint logs


Hi all. I'm trying to assess our WAL checkpoint behaviour during heavy ETL loads. checkpoint_timeout is set to 5min and the server seems to be checkpointing almost constantly, which may be fine. We also see the percentage of buffers written climbing in the checkpoint log lines. Here's a run:

2023-06-21 12:20:02 UTC::@:[396]:LOG:  checkpoint complete: wrote 88523 buffers (4.4%); 0 WAL file(s) added, 0 removed, 29 recycled; write=268.929 s, sync=0.288 s, total=269.395 s; sync files=177, longest=0.178 s, average=0.002 s; distance=2368871 kB, estimate=4720836 kB
2023-06-21 12:15:32 UTC::@:[396]:LOG:  checkpoint starting: time
2023-06-21 12:15:02 UTC::@:[396]:LOG:  checkpoint complete: wrote 63434 buffers (3.1%); 0 WAL file(s) added, 0 removed, 43 recycled; write=269.279 s, sync=0.676 s, total=270.448 s; sync files=220, longest=0.487 s, average=0.004 s; distance=2830955 kB, estimate=4982166 kB
2023-06-21 12:10:32 UTC::@:[396]:LOG:  checkpoint starting: time
2023-06-21 12:10:02 UTC::@:[396]:LOG:  checkpoint complete: wrote 42229 buffers (2.1%); 0 WAL file(s) added, 0 removed, 45 recycled; write=269.272 s, sync=0.167 s, total=269.642 s; sync files=224, longest=0.131 s, average=0.001 s; distance=2923597 kB, estimate=5221189 kB
2023-06-21 12:05:32 UTC::@:[396]:LOG:  checkpoint starting: time
2023-06-21 12:05:03 UTC::@:[396]:LOG:  checkpoint complete: wrote 249580 buffers (12.4%); 0 WAL file(s) added, 0 removed, 79 recycled; write=269.868 s, sync=1.333 s, total=271.434 s; sync files=256, longest=0.375 s, average=0.006 s; distance=5209546 kB, estimate=5476477 kB
2023-06-21 12:00:32 UTC::@:[396]:LOG:  checkpoint starting: time
2023-06-21 11:59:45 UTC::@:[396]:LOG:  checkpoint complete: wrote 357329 buffers (17.7%); 0 WAL file(s) added, 1 removed, 83 recycled; write=249.722 s, sync=2.006 s, total=252.201 s; sync files=137, longest=0.849 s, average=0.015 s; distance=5504554 kB, estimate=5506136 kB
2023-06-21 11:55:32 UTC::@:[396]:LOG:  checkpoint starting: wal
2023-06-21 11:55:13 UTC::@:[396]:LOG:  checkpoint complete: wrote 470167 buffers (23.3%); 0 WAL file(s) added, 0 removed, 84 recycled; write=250.553 s, sync=0.026 s, total=250.705 s; sync files=193, longest=0.006 s, average=0.001 s; distance=5498807 kB, estimate=5506312 kB
2023-06-21 11:51:02 UTC::@:[396]:LOG:  checkpoint starting: wal
2023-06-21 11:50:44 UTC::@:[396]:LOG:  checkpoint complete: wrote 437633 buffers (21.7%); 0 WAL file(s) added, 0 removed, 84 recycled; write=157.932 s, sync=0.005 s, total=158.886 s; sync files=197, longest=0.004 s, average=0.001 s; distance=5507145 kB, estimate=5507145 kB
2023-06-21 11:48:05 UTC::@:[396]:LOG:  checkpoint starting: wal
2023-06-21 11:47:57 UTC::@:[396]:LOG:  checkpoint complete: wrote 338877 buffers (16.8%); 0 WAL file(s) added, 0 removed, 84 recycled; write=240.248 s, sync=0.042 s, total=240.928 s; sync files=164, longest=0.020 s, average=0.001 s; distance=5453945 kB, estimate=5453945 kB
2023-06-21 11:43:56 UTC::@:[396]:LOG:  checkpoint starting: wal
2023-06-21 11:43:32 UTC::@:[396]:LOG:  checkpoint complete: wrote 712593 buffers (35.3%); 0 WAL file(s) added, 15 removed, 42 recycled; write=269.229 s, sync=0.004 s, total=269.362 s; sync files=69, longest=0.003 s, average=0.001 s; distance=3454836 kB, estimate=3625475 kB
2023-06-21 11:39:02 UTC::@:[396]:LOG:  checkpoint starting: time
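
For what it's worth, this is how I've been checking the split between timed and WAL-triggered checkpoints since the last stats reset. It's a minimal sketch assuming a pre-17 server where these counters still live in pg_stat_bgwriter (in 17+ they moved to pg_stat_checkpointer as num_timed / num_requested):

    -- Share of checkpoints that were requested (WAL- or manually-triggered)
    -- rather than fired by the checkpoint_timeout timer.
    SELECT checkpoints_timed,
           checkpoints_req,
           round(100.0 * checkpoints_req
                 / nullif(checkpoints_timed + checkpoints_req, 0), 1) AS pct_requested,
           stats_reset
    FROM pg_stat_bgwriter;

A high pct_requested during the ETL window would line up with the "checkpoint starting: wal" lines above, i.e. the WAL volume is forcing checkpoints before the 5-minute timer fires.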

I don't want to read too much into it, but I'm trying to understand the output well enough to decide whether to raise max_wal_size or increase checkpoint_timeout, in case the current settings are making the ETL jobs slower than they should be. Our wal_level is currently set to logical, but we plan to change it to replica soon since logical feels like overkill for us.
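
If we do end up changing anything, this is roughly the shape of it; the distance/estimate figures above top out around 5.5 million kB (a bit over 5 GB) per cycle, which is why max_wal_size looks like the limit being hit on the busier cycles. The numbers below are placeholders for discussion, not values I'm confident in:

    -- Hypothetical settings for illustration only; proper sizing depends on
    -- disk headroom and how long a crash recovery we can tolerate.
    ALTER SYSTEM SET max_wal_size = '16GB';               -- allow more WAL between checkpoints
    ALTER SYSTEM SET checkpoint_timeout = '15min';        -- stretch the timed interval
    ALTER SYSTEM SET checkpoint_completion_target = 0.9;  -- spread writes over the interval (default since 14)
    SELECT pg_reload_conf();                              -- all three are reloadable, no restart needed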

--
