You might find it faster to install contrib/tsearch2 for this sort of text-indexing purpose...
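For example, something along these lines (a minimal sketch only; the table and column names are made up for illustration, and it assumes tsearch2 has already been installed into the database):

    -- add a tsvector column, populate it, and index it (tsearch2-style usage)
    ALTER TABLE documents ADD COLUMN doc_vector tsvector;
    UPDATE documents SET doc_vector = to_tsvector('default', doc_text);
    CREATE INDEX documents_vector_idx ON documents USING gist (doc_vector);

    -- full-text queries then go through the @@ operator
    SELECT doc_id FROM documents
     WHERE doc_vector @@ to_tsquery('default', 'searchword');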
Nörder-Tuitje wrote:
Hi,
I am breaking up huge texts (between 25K and 250K words) into single
words using PL/pgSQL.
For this I am using a temp table in the first step:
-- Outer loop: walk through pText word by word
LOOP
    vLeft := vRight;
    vTmp  := vLeft;

    -- Inner loop: scan forward to the next blank or the end of the text
    LOOP
        vChr := SUBSTRING(pText FROM vTmp FOR 1);
        vTmp := vTmp + 1;
        EXIT WHEN (vChr = ' ' OR vChr IS NULL OR vTmp = cBorder);
    END LOOP;

    vRight := vTmp;
    vLit   := SUBSTRING(pText FROM vLeft FOR (vRight - vLeft - 1));

    -- Store the word, if any, in the temp table
    IF (LENGTH(vLit) > 0) THEN
        WRDCNT := WRDCNT + 1;
        INSERT INTO DEX_TEMPDOC (TMP_DOO_ID, TMP_SEQ_ID, TMP_RAWTEXT)
        VALUES (pDOO_ID, I, vLit);
    END IF;

    I    := I + 1;
    vTmp := LENGTH(vLit);

    -- Report progress every 100 words
    IF ((WRDCNT % 100) = 0) THEN
        PROGRESS := ROUND((100 * I) / DOCLEN, 0);
        RAISE NOTICE '[PROC] % WORDS -- LAST LIT % (Len %) [% PCT / % of %]',
            WRDCNT, vLit, vTmp, PROGRESS, I, DOCLEN;
    END IF;

    EXIT WHEN vRight >= cBorder;
END LOOP;
The document is preprocessed so that there is only a single blank between words.
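(As an aside: given the single-blank separation, a set-based statement could replace the per-character loop entirely. A rough sketch, reusing the variable names from the procedure above and estimating the word count from the number of blanks; note that split_part() rescans the text for each word, so this is only an illustration:)

    INSERT INTO DEX_TEMPDOC (TMP_DOO_ID, TMP_SEQ_ID, TMP_RAWTEXT)
    SELECT pDOO_ID, g.i, split_part(pText, ' ', g.i)
      FROM generate_series(1,
             length(pText) - length(replace(pText, ' ', '')) + 1) AS g(i)
     WHERE length(split_part(pText, ' ', g.i)) > 0;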
My problem is: the first 25K words go quite quickly, but the inserts
become slower and slower. Starting at about 1K words per second, I end up
with 100 words every 10 seconds once I reach 80K-100K words.
The only (non-unique) index on tempdoc is on RAWTEXT.
What can I do? Should I drop the index?
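(A sketch of the drop-and-recreate approach; the index name below is hypothetical, so substitute the real one shown by \d dex_tempdoc:)

    DROP INDEX dex_tempdoc_rawtext_idx;      -- hypothetical index name
    -- ... run the word-splitting procedure / bulk INSERTs ...
    CREATE INDEX dex_tempdoc_rawtext_idx ON dex_tempdoc (tmp_rawtext);
    ANALYZE dex_tempdoc;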
Here is my config:
shared_buffers = 2000           # min 16, at least max_connections*2, 8KB each
work_mem = 32768 # min 64, size in KB
maintenance_work_mem = 16384 # min 1024, size in KB
max_stack_depth = 8192 # min 100, size in KB
enable_hashagg = true
enable_hashjoin = true
enable_indexscan = true
enable_mergejoin = true
enable_nestloop = true
enable_seqscan = false
The machine is a 3 GHz Xeon with 1 GB RAM and a SATA RAID 1 array, running
the precompiled i686 build of 8.0.4.
Thanks!
Kind regards,

Dipl.-Inform. Marcus Noerder-Tuitje
Developer

software technology AG
Kortumstraße 16
44787 Bochum
Tel: 0234 / 52 99 6 26
Fax: 0234 / 52 99 6 22
E-Mail: noerder-tuitje@xxxxxxxxxxxxx
Internet: www.technology.de