Re: gin index creation performance problems

Oleg Bartunov <oleg@xxxxxxxxxx> · Mon, 3 Nov 2008 19:03:18 +0300 (MSK)

On Mon, 3 Nov 2008, Ivan Sergio Borgonovo wrote:

I'm looking for a bit more guidance on gin index creation.

The process:
- vacuum analyze.
- start a transaction that:
  - drop the triggers to update a tsvector
  - drop the index on the tsvector
  - fill several tables
  - update the tsvector in a table with ~800K records
  - recreate the gin index
- commit
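
For readers following along, the steps above can be sketched roughly as below. This is only an illustration, not the poster's actual script: the table "catalog", the tsvector column "fts", the text columns "title" and "author", the trigger "catalog_fts_update", the index "catalog_fts_idx" and the 'english' configuration are all hypothetical names, and only two of the six source fields are shown.

```sql
-- Hypothetical schema: catalog(title text, author text, fts tsvector)

-- VACUUM cannot run inside a transaction block, so it comes first.
VACUUM ANALYZE;

BEGIN;

-- Drop the trigger that keeps the tsvector up to date (hypothetical name).
DROP TRIGGER catalog_fts_update ON catalog;

-- Drop the GIN index on the tsvector column (hypothetical name).
DROP INDEX catalog_fts_idx;

-- ... bulk load of several tables goes here ...

-- Rebuild the tsvector for every row from the concatenated text fields.
UPDATE catalog
   SET fts = to_tsvector('english',
                         coalesce(title, '') || ' ' || coalesce(author, ''));

-- Recreate the GIN index in one pass over the finished data.
CREATE INDEX catalog_fts_idx ON catalog USING gin (fts);

COMMIT;
```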

To have a rough idea of the data involved:
- 800K records
- tsvector formed from the concatenation of 6 fields
- total length of concatenated fields: ~200 chars *
- average number of lexemes per tsvector: 10 *
[*] guessed

2x Xeon HT 3.2GHz, 4GB RAM, SCSI RAID5

Index creation takes more than 1h.

maintenance_work_mem is still at its default. What would be a good value
to start from?
Is there anything else I can do to improve performance?

Why didn't you change maintenance_work_mem? You can change it online, just
before CREATE INDEX. Bulk GIN index creation uses it as a buffer, so raising
it can save a lot of I/O.
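
A minimal sketch of changing the setting online, just before the index build. The value '512MB' is only an illustrative starting point for a 4GB machine, not a tested recommendation, and the table "catalog", column "fts" and index "catalog_fts_idx" are hypothetical names.

```sql
BEGIN;

-- SET LOCAL lasts only until the end of this transaction,
-- so other sessions and later work keep the configured default.
SET LOCAL maintenance_work_mem = '512MB';  -- illustrative value, an assumption

-- Optional: confirm the value in effect for this transaction.
SHOW maintenance_work_mem;

-- The bulk GIN build uses maintenance_work_mem as its buffer.
CREATE INDEX catalog_fts_idx ON catalog USING gin (fts);  -- hypothetical names

COMMIT;
```

Outside a transaction block, a plain SET maintenance_work_mem = '...' applies for the rest of the session, and RESET maintenance_work_mem restores the default.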

All of this is written in the documentation, and there are other parameters
you should be concerned about.

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)