Search Postgresql Archives

pg_dump (COPY) hanging intermittently

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I am running into a strange issue with Postgres 10 when using pg_dump with the directory format and jobs parameter set it intermittently hangs.  Seems to occur less frequently the lower I set the jobs parameter but does happen eventually even when set to 2 (could not reproduce when jobs=1).  I've tested with Postgres 11 and no matter how high I set the jobs parameter it never hangs (tested all the way up to jobs=8).  Also with Postgres 10 and other formats I don't run into the issue.  Here are my server specs and software versions (OS and Postgres)

Windows Server 2016 DataCenter  10.0.14393
2 cores  4 logical processors
26GB memory

Postgres 11.4 (port 5433)
Postgres 10.9 (port 5432)

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
And here are the psql statements and powershell script to recreate the issue.

CREATE DATABASE pg_dump_hang;
\c pg_dump_hang
CREATE TABLE series1 AS SELECT i FROM generate_series(1,1000000) i;
CREATE TABLE series2 AS SELECT i FROM generate_series(1,1000000) i;
CREATE TABLE series3 AS SELECT i FROM generate_series(1,1000000) i;
CREATE TABLE series4 AS SELECT i FROM generate_series(1,1000000) i;
CREATE TABLE series5 AS SELECT i FROM generate_series(1,1000000) i;
CREATE TABLE series6 AS SELECT i FROM generate_series(1,1000000) i;
CREATE TABLE series7 AS SELECT i FROM generate_series(1,1000000) i;
CREATE TABLE series8 AS SELECT i FROM generate_series(1,1000000) i;


$env:PGPASSWORD='#######'
$env:PGUSER='postgres'
$env:PGPORT=5433
set-location "C:\program files\postgresql\11\bin"
$a = 1
DO
{
   remove-item -Recurse D:\dumphangtest
   .\pg_dump.exe --format=directory --jobs=4 --file=D:\dumphangtest pg_dump_hang
   $a++
   "Completed loop $a at -" + (date)
} While ($a -le 25)

$env:PGPASSWORD='#######'
$env:PGUSER='postgres'
$env:PGPORT=5432
set-location "c:\program files\postgresql\10\bin"
$a = 1
DO
{
   remove-item -Recurse D:\dumphangtest
   .\pg_dump.exe --format=directory --jobs=4 --file=D:\dumphangtest pg_dump_hang
   $a++
   "Completed loop $a at -" + (date)
} While ($a -le 25)


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
Here is the output from pg_stat_activity and the filesystem listing when the hang occurs.

--jobs=4
postgres=# select pid,state,state_change,query_start,wait_event,wait_event_type,substring(query,1,35) from pg_stat_activity where datname like 'pg_dump%';
 pid  |        state        |         state_change          |          query_start          | wait_event  | wait_event_type |              substring
------+---------------------+-------------------------------+-------------------------------+-------------+-----------------+-------------------------------------
 5324 | idle in transaction | 2019-06-27 15:29:13.03416+01  | 2019-06-27 15:29:13.03154+01  | ClientRead  | Client          | SELECT at.attname, (SELECT pg_catal
 4144 | idle in transaction | 2019-06-27 15:29:16.222315+01 | 2019-06-27 15:29:15.070106+01 | ClientRead  | Client          | COPY public.series6 (i) TO stdout;
 5576 | active              | 2019-06-27 15:29:13.642069+01 | 2019-06-27 15:29:13.642068+01 | ClientWrite | Client          | COPY public.series2 (i) TO stdout;
 3860 | idle in transaction | 2019-06-27 15:29:17.109549+01 | 2019-06-27 15:29:16.198842+01 | ClientRead  | Client          | COPY public.series8 (i) TO stdout;
 3120 | idle in transaction | 2019-06-27 15:29:16.350121+01 | 2019-06-27 15:29:15.089817+01 | ClientRead  | Client          | COPY public.series7 (i) TO stdout;
(5 rows)

D:\>dir dumphangtest
 Volume in drive D has no label.
 Volume Serial Number is C635-CDEF

 Directory of D:\dumphangtest

06/27/2019  10:29 AM    <DIR>          .
06/27/2019  10:29 AM    <DIR>          ..
06/27/2019  10:29 AM         2,114,899 2863.dat.gz
06/27/2019  10:29 AM           376,832 2864.dat.gz
06/27/2019  10:29 AM         2,114,899 2865.dat.gz
06/27/2019  10:29 AM         2,114,899 2866.dat.gz
06/27/2019  10:29 AM         2,114,899 2867.dat.gz
06/27/2019  10:29 AM         2,114,899 2868.dat.gz
06/27/2019  10:29 AM         2,114,899 2869.dat.gz
06/27/2019  10:29 AM         2,114,899 2870.dat.gz
06/27/2019  10:29 AM             8,452 toc.dat
               9 File(s)     15,189,577 bytes
               2 Dir(s)  102,021,439,488 bytes free

--jobs=2
postgres=# select pid,state,state_change,query_start,wait_event,wait_event_type,substring(query,1,35) from pg_stat_activity where datname like 'pg_dump%';
 pid  |        state        |         state_change          |          query_start          | wait_event  | wait_event_type |              substring
------+---------------------+-------------------------------+-------------------------------+-------------+-----------------+-------------------------------------
 3420 | idle in transaction | 2019-06-27 14:37:55.067559+01 | 2019-06-27 14:37:54.448198+01 | ClientRead  | Client          | COPY public.series8 (i) TO stdout;
 3180 | idle in transaction | 2019-06-27 14:37:49.537396+01 | 2019-06-27 14:37:49.53608+01  | ClientRead  | Client          | SELECT at.attname, (SELECT pg_catal
 5364 | active              | 2019-06-27 14:37:50.151002+01 | 2019-06-27 14:37:50.151001+01 | ClientWrite | Client          | COPY public.series2 (i) TO stdout;
(3 rows)

D:\>dir dumphangtest
 Volume in drive D has no label.
 Volume Serial Number is C635-CDEF

 Directory of D:\dumphangtest

06/27/2019  09:37 AM    <DIR>          .
06/27/2019  09:37 AM    <DIR>          ..
06/27/2019  09:37 AM         2,114,899 2863.dat.gz
06/27/2019  09:37 AM            81,920 2864.dat.gz
06/27/2019  09:37 AM         2,114,899 2865.dat.gz
06/27/2019  09:37 AM         2,114,899 2866.dat.gz
06/27/2019  09:37 AM         2,114,899 2867.dat.gz
06/27/2019  09:37 AM         2,114,899 2868.dat.gz
06/27/2019  09:37 AM         2,114,899 2869.dat.gz
06/27/2019  09:37 AM         2,114,899 2870.dat.gz
06/27/2019  09:37 AM             8,452 toc.dat
               9 File(s)     14,894,665 bytes
               2 Dir(s)  102,056,497,152 bytes free


----------------------------------------------------------------------------------------------------------

And here is the stack trace from the thread of the hung process.

ntoskrnl.exe!KiSwapContext+0x76
ntoskrnl.exe!KiSwapThread+0x17d
ntoskrnl.exe!KiCommitThreadWait+0x14f
ntoskrnl.exe!KeWaitForSingleObject+0x377
ntoskrnl.exe!KiSchedulerApc+0x231
ntoskrnl.exe!KiDeliverApc+0x22a
ntoskrnl.exe!KiSwapThread+0x364
ntoskrnl.exe!KiCommitThreadWait+0x14f
ntoskrnl.exe!KeWaitForMultipleObjects+0x1fe
ntoskrnl.exe!ObWaitForMultipleObjects+0x2c7
ntoskrnl.exe!NtWaitForMultipleObjects+0xf9
ntoskrnl.exe!KiSystemServiceCopyEnd+0x13
ntdll.dll!NtWaitForMultipleObjects+0x14
KERNELBASE.dll!WaitForMultipleObjectsEx+0xef
KERNELBASE.dll!WaitForMultipleObjects+0xe
postgres.exe!WaitLatchOrSocket+0x456
postgres.exe!WaitEventSetWait+0x8e
postgres.exe!secure_write+0xeb
postgres.exe!pq_putbytes+0x4e5
postgres.exe!pq_putbytes+0x41b
postgres.exe!pq_putbytes+0x256
postgres.exe!CreateCopyDestReceiver+0x3784
postgres.exe!CreateCopyDestReceiver+0x17b2
postgres.exe!CreateCopyDestReceiver+0x144c
postgres.exe!CreateCopyDestReceiver+0x1098
postgres.exe!DoCopy+0x50a
postgres.exe!standard_ProcessUtility+0x37b
pg_stat_statements.dll!pg_finfo_pg_stat_statements+0xde1
postgres.exe!ProcessUtility+0x34
postgres.exe!PortalRunFetch+0x891
postgres.exe!PortalRunFetch+0xbae
postgres.exe!PortalRun+0x1a0
postgres.exe!get_stats_option_name+0x1335
postgres.exe!PostgresMain+0x635
postgres.exe!ShmemBackendArrayAllocation+0x2b3a
postgres.exe!SubPostmasterMain+0x275
postgres.exe!main+0x480
postgres.exe!pgwin32_popen+0x144f
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21


Any ideas on changes in Postgres 10 that would cause this?  We were previously running 9.6.7 and didn't encounter this issue on that version either.

Thanks
Ben.

Ben Snaidero
Geotab
Senior Database Specialist
Direct+1 (289) 230-7749
Toll-free+1 (877) 436-8221
Visitwww.geotab.com
Twitter|Facebook|YouTube|LinkedIn

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux