On Wed, Jan 29, 2020 at 10:37 PM Nicola Contu <nicola.contu@xxxxxxxxx> wrote:
> This is the error on postgres log of the segmentation fault :
>
> 2020-01-21 14:20:29 GMT [] [42222]: [108-1] db=,user= LOG: server process (PID 2042) was terminated by signal 11: Segmentation fault
> 2020-01-21 14:20:29 GMT [] [42222]: [109-1] db=,user= DETAIL: Failed process was running: select pid from pg_stat_activity where query ilike 'REFRESH MATERIALIZED VIEW CONCURRENTLY matview_vrs_request_stats'
> 2020-01-21 14:20:29 GMT [] [42222]: [110-1] db=,user= LOG: terminating any other active server processes

Ok, this is a bug.  Do you happen to have a core file?  I don't recall
where CentOS puts them.

> > If you're on Linux, you can probably see them with "ls /dev/shm".
>
> I see a lot of files there, and doing a cat they are empty. What can I do with them?

Not much, but it tells you approximately how many 'slots' are in use
at a given time (ie because of currently running parallel queries), if
they were created since PostgreSQL started up (if they're older ones
they could have leaked from a crashed server, but we try to avoid that
by trying to clean them up when you restart).

> Those are two different problems I guess, but they are related because right before the Segmentation Fault I see a lot of shared segment errors in the postgres log.

That gave me an idea...  I hacked my copy of PostgreSQL to flip a coin
to decide whether to pretend there are no slots free (see below), and
I managed to make it crash in the regression tests when doing a
parallel index build.  It's late here now, but I'll look into that
tomorrow.  It's possible that the parallel index code needs to learn
to cope with that.

#2  0x0000000000a096f6 in SharedFileSetInit (fileset=0x80b2fe14c, seg=0x0) at sharedfileset.c:71
#3  0x0000000000c72440 in tuplesort_initialize_shared (shared=0x80b2fe140, nWorkers=2, seg=0x0) at tuplesort.c:4341
#4  0x00000000005ab405 in _bt_begin_parallel (buildstate=0x7fffffffc070, isconcurrent=false, request=1) at nbtsort.c:1402
#5  0x00000000005aa7c7 in _bt_spools_heapscan (heap=0x801ddd7e8, index=0x801dddc18, buildstate=0x7fffffffc070, indexInfo=0x80b2b62d0) at nbtsort.c:396
#6  0x00000000005aa695 in btbuild (heap=0x801ddd7e8, index=0x801dddc18, indexInfo=0x80b2b62d0) at nbtsort.c:328
#7  0x0000000000645b5c in index_build (heapRelation=0x801ddd7e8, indexRelation=0x801dddc18, indexInfo=0x80b2b62d0, isreindex=false, parallel=true) at index.c:2879
#8  0x0000000000643e5c in index_create (heapRelation=0x801ddd7e8, indexRelationName=0x7fffffffc510 "pg_toast_24587_index", indexRelationId=24603, parentIndexRelid=0,

I don't know if that's the bug that you're hitting, but it definitely
could be: REFRESH MATERIALIZED VIEW could be rebuilding an index.

===

diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 90e0d739f8..f0b49d94ee 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -468,6 +468,13 @@ dsm_create(Size size, int flags)
 	nitems = dsm_control->nitems;
 	for (i = 0; i < nitems; ++i)
 	{
+		/* BEGIN HACK */
+		if (random() % 10 > 5)
+		{
+			nitems = dsm_control->maxitems;
+			break;
+		}
+		/* END HACK */
 		if (dsm_control->item[i].refcnt == 0)
 		{
 			dsm_control->item[i].handle = seg->handle;
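
For what it's worth, here is a rough, untested sketch of the kind of
fallback _bt_begin_parallel() might need, under the assumption that the
crash comes from InitializeParallelDSM() failing to get a segment: when
dsm_create() finds no free slot (it is called with
DSM_CREATE_NULL_IF_MAXSEGMENTS), the parallel context ends up with
pcxt->seg == NULL, so the caller would have to abandon the parallel
build instead of handing that NULL segment to
tuplesort_initialize_shared()/SharedFileSetInit(), as in the backtrace
above.

	/*
	 * Rough sketch only, not a tested patch.  Inside _bt_begin_parallel(),
	 * after the parallel context has been set up:
	 */
	InitializeParallelDSM(pcxt);

	/*
	 * If no DSM segment was available (all slots in use), give up on
	 * parallelism; leaving buildstate->btleader unset makes the caller
	 * fall back to a serial sort.  Other cleanup (e.g. any registered
	 * snapshot) is omitted here.
	 */
	if (pcxt->seg == NULL)
	{
		DestroyParallelContext(pcxt);
		ExitParallelMode();
		return;
	}

Something along those lines would make a DSM-starved system degrade to
a serial index build rather than crash, but I'd want to check all the
other callers that assume InitializeParallelDSM() always produces a
segment.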