Re: Improve PostGIS performance with 62 million rows?

Rémi Cura <remi.cura@xxxxxxxxx> · Mon, 9 Jan 2017 21:53:36 +0100

Hey,
I like your curiosity ! 

At the billion range, you __have__ to use pgpointcloud,
pyramid raster solution (actually the more common way to perform this task)
or another database (hello monetdb).
Cheers,
Rémi-C

2017-01-09 20:11 GMT+01:00 Jonathan Vanasco <postgres@xxxxxxxx>:

On Jan 9, 2017, at 12:49 PM, Israel Brewster wrote:

>  Planning time: 4.554 ms

>  Execution time: 225998.839 ms

> (20 rows)

>

> So a little less than four minutes. Not bad (given the size of the database), or so I thought.

>

> This morning (so a couple of days later) I ran the query again without the explain analyze to check the results, and noticed that it didn't take anywhere near four minutes to execute. So I ran the explain analyze again, and got this:

...

>  Planning time: 0.941 ms

>  Execution time: 9636.285 ms

> (20 rows)

>

> So from four minutes on the first run to around 9 1/2 seconds on the second. Presumably this difference is due to caching? I would have expected any caches to have expired by the time I made the second run, but the data *is* static, so I guess not. Otherwise, I don't know how to explain the improvement on the second run - the query plans appear identical (at least to me). *IS* there something else (for example, auto vacuum running over the weekend) that could explain the performance difference?

This may sound crazy, but I suggest running each of these scenarios 3+ times:

        # cold explain

        stop postgres

        start postgres

        explain analyze SELECT

        # cold select

        stop postgres

        start postgres

        enable \t for query timing

        SELECT

        # cold explain to select

        stop postgres

        start postgres

        explain analyze SELECT

        enable \t for query timing

        SELECT

        # cold select to explain

        stop postgres

        start postgres

        enable \t for query timing

        SELECT

        explain analyze SELECT

        # cold select to select

        stop postgres

        start postgres

        enable \t for query timing

        SELECT

        SELECT

I've found the timing for "Explain Analyze" to be incredibly different from an actual SELECT on complex/large dataset queries... and the differences don't seem to correlate to possible speedups from index/table caching.

--

Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)

To make changes to your subscription:

http://www.postgresql.org/mailpref/pgsql-general