Hi,
I have a set of relatively complex queries producing tables (including postgis and pgpointcloud functions) that I run consecutively, and sometimes (depending on the base-data) my 5th query stalls (CPU 100%, runs forever) seemingly on the percentile_cont function. When I replace percentile_cont with just a value it passes.
The setup roughly looks like this:
query1 - creates 77 records with polygons
query2 - creates 89 records with polygons
query3 - creates ~350k records with points (inside above polygons)
query4 - creates ~220k records with points clustered by height (from result query3)
query5 - creates ~102k records with point clustered by normal (from result query3)
The odd thing is, when I run query5 directly after query4, it will stall on some datasets (always same sets). Though when I cancel the query and run it again, it will pass in about 2 seconds.
When I run query5 from a different shell directly after query4, it also passes.
Does anyone have a clue on how to research this further? How can I look into the process itself to see what is cycling the CPU? How to break this problem down to smaller chunks? I am a bit out of options myself.
For what it's worth:
SELECT version();
"PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406, 64-bit"
SELECT postgis_full_version();
"POSTGIS="2.5.0beta1dev r16609" [EXTENSION] PGSQL="100" GEOS="3.6.2-CAPI-1.10.2 4d2925d6" SFCGAL="1.3.1" PROJ="Rel. 5.1.0, June 1st, 2018" GDAL="GDAL 2.3.1, released 2018/06/22" LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.2.1" RASTER"
The part of query5 where I replace percentile_cont with just a value:
SELECT
c1.id,c1.cid, c1.blockid,
CASE
WHEN aspectmedian > 360 THEN aspectmedian - 360
ELSE aspectmedian
END as aspect,
pt geom
FROM clustered c1
JOIN (
SELECT cid, blockid, aspectclass,
--find median aspect and median height within bucket
--percentile_cont(0.5) within group (order by aspect) as aspectmedian
c1.id,c1.cid, c1.blockid,
CASE
WHEN aspectmedian > 360 THEN aspectmedian - 360
ELSE aspectmedian
END as aspect,
pt geom
FROM clustered c1
JOIN (
SELECT cid, blockid, aspectclass,
--find median aspect and median height within bucket
--percentile_cont(0.5) within group (order by aspect) as aspectmedian
0 as aspectmedian --replaced by value 0
FROM clustered
GROUP BY blockid,cid, aspectclass
) c2 ON (c1.cid = c2.cid AND c1.blockid = c2.blockid AND c1.aspectclass = c2.aspectclass)
WHERE c1.cid Is Not Null;
GROUP BY blockid,cid, aspectclass
) c2 ON (c1.cid = c2.cid AND c1.blockid = c2.blockid AND c1.aspectclass = c2.aspectclass)
WHERE c1.cid Is Not Null;
Best,
Tom