Re: gin creation and previous history of server

Ivan Sergio Borgonovo <mail@xxxxxxxxxxxxxxx> · Wed, 5 Nov 2008 16:05:24 +0100

On Tue, 04 Nov 2008 10:33:26 -0500
Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

> Ivan Sergio Borgonovo <mail@xxxxxxxxxxxxxxx> writes:
> > It seems that gin creation is triggering something nasty in the
> > server that depends on previous history of the server.
> 
> Can you put together a self-contained test case that illustrates
> this?

I'm trying... Tonight I just let my long transaction run all night.
It has been running for about 10h and it blocked on index
re-creation.

During all this runs:
shared_buffers='120MB';
maintenance_work_mem='200MB';

Stopped the server. dropped and recreated the index outside a
transaction: ~6 minutes.

Stopped the server, created a transaction that just drop and rebuild
the index, still below 10 minutes.

gist index creation outside the transaction ~2min
gist index creation inside the transaction ~2min

After transaction committed but other connection:
vacuum full
drop index
recreate as gin index: ~6min

I even:
begin;
drop index
recreate all tsvectors
create gin index
commit;

and times are still in the ~6min so it must be some other part of
the transaction having side effects.

In one batch (same connection):
vacuum full;
transaction
don't recreate gin inside the transaction
commit
vacuum analyze;
recreate gin: dead!

stopped the server again in one batch (same connection):
vacuum full;
transaction
don't recreate gin inside the transaction
commit
vacuum analyze;
recreate gin: dead!

OK... some clue...
run the batch, close the transaction etc...
** open a new connection **
create the gin index... work!

I noticed that when I open a new connection to build the index CPU
usage doesn't go to 100%.

I'm not completely convinced that it's just a problem of new
connection I've no sufficient statistical data to prove
correlation... but it is hard to collect since the transaction is
long and made of several pieces, and I can record a failure just
after waiting at least 6min for each index rebuild.
I'll try further to see if I can collect more data to see if there
is any chance there is a bug in PostgreSQL, but considering how much
time it is required for each test, you'll have to be patient.

Now I could avoid to drop the index at the beginning of the
transaction. tsvectors are updated with a trigger and I'm disabling
the trigger at the beginning of the transaction, so tsvectors aren't
going to be changed during the transaction.
But several rows are going to be deleted.

I'm wondering if I leave the gin index there, without dropping it...
will it have impact on performances considering that a LOT of rows
are going to be deleted.
So I could just drop and recreate the index in another connection
inside a transaction.

-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general