On Fri, Oct 23, 2009 at 2:32 PM, Jesper Krogh <jesper@xxxxxxxx> wrote:
> Tom Lane wrote:
>> Jesper Krogh <jesper@xxxxxxxx> writes:
>>> Tom Lane wrote:
>>>> ... There's something strange about your tsvector index. Maybe
>>>> it's really huge because the documents are huge?
>>
>>> Huge is a relative term, but length(to_tsvector(body)) is about 200
>>> for each document. Is that huge?
>>
>> It's bigger than the toy example I was trying, but not *that* much
>> bigger. I think maybe your index is bloated. Try dropping and
>> recreating it and see if the estimates change any.
>
> I'm a bit reluctant to drop and re-create it. It would take a couple
> of days to regenerate, so this should hopefully not be a common
> situation for the system.

Note that if it is bloated, you can build the replacement with CREATE
INDEX CONCURRENTLY and drop the old index once the new one finishes,
so there's no time spent without an index.

> I have set the statistics target to 1000 for the tsvector; the
> documentation didn't mention any significant downsides to doing that,
> and since then I haven't seen row estimates that are orders of
> magnitude off.

It mostly increases planning time. It also increases ANALYZE time, but
not by much.

> It is built from scratch using inserts, all the way to around 10m
> rows now; should that result in index bloat? Can I inspect the amount
> of bloat without rebuilding (or a similar locking operation)?

That depends on how many failed inserts there were. If 95% of all your
inserts failed, then yeah, it would be bloated.
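
For the concurrent rebuild, something along these lines should work
(the index, table, and column names here are just for illustration;
substitute your own):

  CREATE INDEX CONCURRENTLY body_tsv_idx_new
      ON docs USING gin (body_tsv);
  -- once the new index is valid:
  DROP INDEX body_tsv_idx;
  ALTER INDEX body_tsv_idx_new RENAME TO body_tsv_idx;

Keep in mind that CREATE INDEX CONCURRENTLY takes longer than a plain
CREATE INDEX and can leave an invalid index behind if it fails, so
check \d output (or pg_index.indisvalid) before dropping the old one.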
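
For reference, the per-column statistics target mentioned above is
normally set like this (again, table and column names are
illustrative), followed by a fresh ANALYZE so the new target takes
effect:

  ALTER TABLE docs ALTER COLUMN body_tsv SET STATISTICS 1000;
  ANALYZE docs;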
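
As for watching the index without taking any heavy locks,
pg_relation_size at least shows how big it currently is; comparing
that against a copy freshly built with CREATE INDEX CONCURRENTLY is a
crude but simple way to gauge the bloat (index name is illustrative):

  SELECT pg_size_pretty(pg_relation_size('body_tsv_idx'));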