I'm using gin index for my full text search engine in production. These
days the gin index size grows to 20-30G and the system started to suffer
with periodical insert hanging. This is same as described in the 2 posts:
http://postgresql.1045698.n5.nabble.com/Random-penalties-on-GIN-index-updates-td2073848.html
http://postgresql.1045698.n5.nabble.com/Periodically-slow-inserts-td3230434.html
The gin index is on a dedicated raid 10 SAS disk and the performance
should be enough for normal db operation. But I always see almost 100%
disk utiliztion on the disk when the inserts hang. The utiliztion for
other data(such as the full text table data) on another disk(same setup
as the gin index disk: SAS raid 10) is quite low comparing with the gin
index disk. From my observation, looks too much data is written to the
disk when the pending list of gin index is flushed to the disk. Below is
the outupt of 'iostat -xm 3' on the disk when inserts hang:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s
avgrq-sz avgqu-sz await svctm %util
sde 0.00 0.00 0.67 2614.00 0.08 22.94
18.03 32.94 12.61 0.38 100.00
sde 0.00 0.00 1.67 2377.33 0.17 20.43
17.73 32.00 13.44 0.42 100.00
sde 0.00 0.00 15.67 2320.33 0.23 20.13
17.85 31.99 13.73 0.43 100.00
sde 0.00 0.00 7.33 1525.00 0.12 14.02
18.90 32.00 20.83 0.65 100.00
sde 0.00 0.00 14.33 1664.67 0.12 15.54
19.10 32.00 19.06 0.60 100.00
sde 0.00 0.00 5.33 1654.33 0.04 12.07
14.94 32.00 19.22 0.60 100.00
I tried to increase work_mem but the inserts hang more time each time
with less frequency. So it makes almost no difference for the total
hanging time. Frequent vacuum is not a choice since the hang happens
very 3-5 mins. is there any improvement I can make with pg for such data
volumn(still increasing) or it's time to turn to other full text search
solution such as lucene etc?
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance