Re: Gist indexing performance with cidr types

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 26 Aug 2015, Emre Hasegeli wrote:

Are the coverage operatons just that expensive?

They shouldn't be.  A similar query like yours works in 0.5 second on my laptop:
[snip]

I get the same from your testcase.

Maybe, something we haven't expected about your dataset causes a
performance regression on the index.  Did you see anything relevant on
the server logs on index creation time?

I tried dropping and re-creating the index. The only log entry was for the drop statement.

The distribution of the data is not uniform like the data set you produce. Though I find it hard to believe that it would affect this as much.

select masklen(route), count(*) from routes group by masklen(route);

 masklen |  count
---------+---------
       8 |      47
       9 |      30
      10 |      84
      11 |     225
      12 |     580
      13 |    1163
      14 |    2401
      15 |    4530
      16 |   32253
      17 |   20350
      18 |   35583
      19 |   76307
      20 |  111913
      21 |  132807
      22 |  229578
      23 |  286986
      24 | 1149793

Rest is rather small, though with bumps at /32 and /48 (typical IPv6 prefix length).

Real-world address space is very fragmented, where as some is unused.

Then there is the mixed IPv6 and IPv4 data that might factor in.


I tried the approach from your benchmark, to try make a more isolated test case:

irrexplorer=> SELECT DISTINCT route INTO hmm FROM routes_view WHERE asn = 2914;
SELECT 732

irrexplorer=> explain analyze select routes.route from routes join hmm on routes.route && hmm.route;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.41..511914.27 rows=2558 width=7) (actual time=8.096..17209.778 rows=8127 loops=1)
   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.010..0.609 rows=732 loops=1)
   ->  Index Only Scan using route_gist on routes  (cost=0.41..470.32 rows=22900 width=7) (actual time=4.823..23.502 rows=11 loops=732)
         Index Cond: (route && (hmm.route)::inet)
         Heap Fetches: 0
 Planning time: 0.971 ms
 Execution time: 17210.627 ms
(7 rows)

The only difference in the query plan is that the above used an index only, where as your test case used index scan (it did this for me as well). I tried without index only scan:

irrexplorer=> set enable_indexonlyscan =false;
SET

                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.41..571654.27 rows=2558 width=7) (actual time=6.406..15899.791 rows=8127 loops=1)
   ->  Seq Scan on hmm  (cost=0.00..11.32 rows=732 width=7) (actual time=0.011..0.615 rows=732 loops=1)
   ->  Index Scan using route_gist on routes  (cost=0.41..551.93 rows=22900 width=7) (actual time=4.490..21.712 rows=11 loops=732)
         Index Cond: ((route)::inet && (hmm.route)::inet)
 Planning time: 0.505 ms
 Execution time: 15900.669 ms
(6 rows)

Slight faster, but nothing significant. Something seems wonky.


    Best regards, Henrik



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux