I haven't looked at pg's code for creating an index, but I strongly suspect it does an external sort and then inserts into the index. Such an operation would have 100% correlation from the index-insert point of view, and the "sort" could be done in memory or be the tape variety (a more efficient I/O pattern). In summary: the number of indexes, index correlation, pg's multi-versioning, shared_buffers and checkpoint_segments are interconnected in weird and wonderful ways... I have seldom found "simple" solutions to performance problems.

Marc

> -----Original Message-----
> From: pgsql-performance-owner@xxxxxxxxxxxxxx
> [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx] On Behalf Of
> Aaron Turner
> Sent: Friday, February 10, 2006 3:17 AM
> To: pgsql-performance@xxxxxxxxxxxxxx
> Subject: [PERFORM] 10+hrs vs 15min because of just one index
>
> So I'm trying to figure out how to optimize my PG install
> (8.0.3) to get better performance without dropping one of my indexes.
>
> Basically, I have a table of 5M records with 3 columns:
>
> pri_key (SERIAL)
> data char(48)
> groupid integer
>
> There is an additional unique index on the data column.
>
> The problem is that when I update the groupid column for all
> the records, the query takes over 10 hrs (after that I just
> canceled the update). Looking at iostat, top and vmstat shows
> I'm horribly disk-I/O bound (on the data array, not the WAL;
> CPU is 85-90% iowait) and not swapping.
>
> Dropping the unique index on data (which isn't used in the
> query), running the update and recreating the index runs in
> under 15 min.
> Hence it's pretty clear to me that the index is the problem
> and there's really nothing worth optimizing in my query.
>
> As I understand from #postgresql, doing an UPDATE on one
> column causes all indexes for the affected row to be
> updated, due to the way PG replaces the old row with a new one
> for updates. This seems to explain why dropping the unique
> index on data solves the performance problem.
>
> Interesting settings:
> shared_buffers = 32768
> maintenance_work_mem = 262144
> fsync = true
> wal_sync_method = open_sync
> wal_buffers = 512
> checkpoint_segments = 30
> effective_cache_size = 10000
> work_mem = <default> (1024, I think?)
>
> Box:
> Linux 2.6.9-11EL (CentOS 4.1)
> 2x Xeon 3.4 HT
> 2 GB of RAM (but Apache and other services are running)
> 4-disk RAID 10 (74 GB Raptors) for data
> 4-disk RAID 10 (7200 rpm) for WAL
>
> Other than throwing more spindles at the problem, any suggestions?
>
> Thanks,
> Aaron
>
> --
> Aaron Turner
> http://synfin.net/
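
To make the scenario concrete, here is a minimal SQL sketch of the table and update Aaron describes. The table name (pkeys), the index name (pkeys_data_key), the PRIMARY KEY on pri_key, and the SET expression are illustrative assumptions; only the column list, types, and the unique index on data come from the post.

  -- Hypothetical reconstruction of the schema described in the post
  -- (names are illustrative; only columns, types, and the unique
  -- index on "data" are taken from the message).
  CREATE TABLE pkeys (
      pri_key  SERIAL PRIMARY KEY,   -- assumed to be the primary key
      data     char(48),
      groupid  integer
  );

  CREATE UNIQUE INDEX pkeys_data_key ON pkeys (data);

  -- Whole-table update of groupid (the SET expression is a placeholder).
  -- Under PG's MVCC each updated row becomes a new row version, so both
  -- the primary-key index and the unique index on "data" get a new entry
  -- for every one of the 5M rows, even though groupid itself isn't indexed.
  UPDATE pkeys SET groupid = 0;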
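
And a sketch of the drop/update/recreate path that finishes in under 15 minutes, which also illustrates the point about index creation above: rebuilding the index performs one big sort (in memory or spilling to disk, bounded by maintenance_work_mem) followed by ordered insertions, rather than 5M random index updates. Table and index names are the same assumptions as in the previous sketch.

  -- The under-15-minute path: drop the unique index, do the bulk update,
  -- then rebuild the index in a single pass.
  BEGIN;
  DROP INDEX pkeys_data_key;
  UPDATE pkeys SET groupid = 0;      -- placeholder SET expression
  COMMIT;

  -- Recreating the index sorts the whole column once, limited by
  -- maintenance_work_mem, and inserts entries in index order: a far
  -- friendlier I/O pattern than millions of random index insertions.
  SET maintenance_work_mem = 262144;   -- kB; matches the posted config
  CREATE UNIQUE INDEX pkeys_data_key ON pkeys (data);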