Re: [GENERAL] Creation of tsearch2 index is very slow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 20, 2006 at 04:50:17PM -0500, Tom Lane wrote:
> (hemdistcache calls hemdistsign --- I think gprof is doing something
> funny with tail-calls here, and showing hemdistsign as directly called
> from gtsvector_picksplit when control really arrives through hemdistcache.)

It may be the compiler. All these functions are declared static, which
gives the compiler quite a bit of leeway to rearrange code.

> The bulk of the problem is clearly in this loop, which performs O(N^2)
> comparisons to find the two entries that are furthest apart in hemdist
> terms:

Ah. A while ago someone came onto the list asking about bit strings
indexing[1]. If I'd known tsearch worked like this I would have pointed
him to it. Anyway, before he went off to implement it he mentioned
"Jarvis-Patrick clustering", whatever that means.

Probably more relevent was this thread[2] on -hackers a while back with
pseudo-code[3]. How well it works, I don't know, it worked for him
evidently, he went away happy...

[1] http://archives.postgresql.org/pgsql-general/2005-11/msg00473.php
[2] http://archives.postgresql.org/pgsql-hackers/2005-11/msg01067.php
[3] http://archives.postgresql.org/pgsql-hackers/2005-11/msg01069.php

Hope this helps,
-- 
Martijn van Oosterhout   <kleptog@xxxxxxxxx>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment: signature.asc
Description: Digital signature


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux