Seeking reason behind performance gain in 12 with HashAggregate

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi All,

I'm testing an upgrade from Postgres 9.6.16 to 12.1 and seeing a significant performance gain in one specific query. This is really great, but I'm just looking to understand why. Reading through the release notes across all the new versions (10, 11, 12) hasn't yielded an obvious cause, but maybe I missed something. Also, I realize it could be related to other factors (config parameters, physical hosts, etc), but the systems are pretty similar so just wondering about Postgres changes.

The query is the following:

SELECT pvc.value, SUM(pvc.count) AS sum
FROM
(SELECT (ST_ValueCount(cv.rast, 1)).*
FROM calveg_whrtype_20m AS cv) AS pvc
GROUP BY pvc.value 

Here is the EXPLAIN (ANALYZE ON, BUFFERS ON) output from both systems:


In the 9.6 plan, the Seq Scan node produced 15,812 rows. 
In the 12 plan, the Seq Scan produced 2,502 rows, and then the ProjectSet node produced 15,812 rows. 

Note that the table (calveg_whrtype_20m) in the two databases have the same number of rows (2,502).

So it seems something about the introduction of the ProjectSet node between the Seq Scan and HashAggregate is optimizing things...? Is this the right conclusion to draw and if so, why might this be happening? Is there something that was changed/improved in either 10, 11 or 12 that this behavior can be attributed to? 

Two more notes -- 

1. If I run the inner subquery without the outer sum/group by, the plans between the two systems are identical.

2. As the calgeg_whrtype_20m table is a raster, I started my question on the PostGIS list, but there was no obvious answer that the gain is related to a change in the PostGIS code so I'm now turning to this list. 

Thank you,
Shira


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux