Re: Gist indexing performance with cidr types

Jeff Janes <jeff.janes@xxxxxxxxx> · Wed, 26 Aug 2015 09:08:01 -0700

On Wed, Aug 26, 2015 at 4:29 AM, Henrik Thostrup Jensen <htj@xxxxxxxxx> wrote:
On Wed, 26 Aug 2015, Emre Hasegeli wrote:

Are the coverage operatons just that expensive?

They shouldn't be.  A similar query like yours works in 0.5 second on my laptop:

[snip]

I get the same from your testcase.

Maybe, something we haven't expected about your dataset causes a

performance regression on the index.  Did you see anything relevant on

the server logs on index creation time?

I tried dropping and re-creating the index. The only log entry was for the drop statement.

The distribution of the data is not uniform like the data set you produce. Though I find it hard to believe that it would affect this as much.

select masklen(route), count(*) from routes group by masklen(route);

Any chance you can share the actual underlying data?  I noticed it wasn't on github, but is that because it is proprietary, or just because you don't think it is interesting?

irrexplorer=> explain analyze select routes.route from routes join hmm on routes.route && hmm.route;

                                                               QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------

 Nested Loop  (cost=0.41..511914.27 rows=2558 width=7) (actual time=8.096..17209.778 rows=8127 loops=1)

   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.010..0.609 rows=732 loops=1)

   ->  Index Only Scan using route_gist on routes  (cost=0.41..470.32 rows=22900 width=7) (actual time=4.823..23.502 rows=11 loops=732)

         Index Cond: (route && (hmm.route)::inet)

         Heap Fetches: 0

 Planning time: 0.971 ms

 Execution time: 17210.627 ms

(7 rows)

If you loop over the 732 rows yourself, issuing the simple query against each retrieved constant value:

explain (analyze,buffers) select routes.route from routes where route && $1

Does each one take about the same amount of time, or are there some outlier values which take much more time than the others?

Cheers,

Jeff