Re: GiST index performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 25 Feb 2010, Bruce Momjian wrote:
Was there every any conclusion on this issue?

Not really. Comments inline:

Matthew Wakeling wrote:
Revisiting the thread a month back or so, I'm still investigating
performance problems with GiST indexes in Postgres.

Looking at http://wiki.postgresql.org/wiki/PostgreSQL_8.4_Open_Items I'd
like to clarify the contrib/seg issue. Contrib/seg is vulnerable to
pathological behaviour which is fixed by my second patch, which can be
viewed as complete. Contrib/cube, being multi-dimensional, is not affected
to any significant degree, so should not need alteration.

This issue is addressed by my patch, which AFAIK noone has reviewed. However, that patch was derived from a patch that I applied to bioseg, which is itself a derivative of seg. This patch works very well indeed, and gave an approximate 100 times speed improvement in the one test I ran.

So you could say that the sister patch of the one I submitted is tried and tested in production.

A second quite distinct issue is the general performance of GiST indexes
which is also mentioned in the old thread linked from Open Items. For
that, we have a test case at
http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php for
btree_gist indexes. I have a similar example with the bioseg GiST index. I
have completely reimplemented the same algorithms in Java for algorithm
investigation and instrumentation purposes, and it runs about a hundred
times faster than in Postgres. I think this is a problem, and I'm willing
to do some investigation to try and solve it.

I have not made any progress on this issue. I think Oleg and Teodor would be better placed working it out. All I can say is that I implemented the exact same indexing algorithm in Java, and it performed 100 times faster than Postgres. Now, Postgres has to do a lot of additional work, like mapping the index onto disc, locking pages, and abstracting to plugin user functions, so I would expect some difference - I'm not sure 100 times is reasonable though. I tried to do some profiling, but couldn't see any one section of code that was taking too much time. Not sure what I can further do.

Matthew

--
Some people, when confronted with a problem, think "I know, I'll use regular
expressions." Now they have two problems.                  -- Jamie Zawinski

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux