Re: DB Performance decreases due to often written/accessed table

"Merlin Moncure" <mmoncure@xxxxxxxxx> · Thu, 19 Oct 2006 13:32:22 -0400

On 10/19/06, Jens Schipkowski <jens.schipkowski@xxxxxxxxxx> wrote:
// select finds out which one has not an twin
// a twin is defined as record with the same attr* values
// decreases speed over time until timeout by postgresql
SELECT *
 FROM tbl_reg reg
WHERE register <> loc1 AND
        idreg NOT IN
                (
                SELECT reg.idreg
                FROM tbl_reg reg, tbl_reg regtwin
                WHERE regtwin.register = 1 AND
                        regtwin.type <> 20 AND
                        reg.attr1 = regtwin.attr1 AND
                        reg.attr2 = regtwin.attr2 AND
                        reg.attr3 = regtwin.attr3 AND
                        reg.attr4 = regtwin.attr4 AND
                        reg.attr5 = regtwin.attr5 AND
                        reg.attr6 = regtwin.attr6 AND
                        reg.idreg <> regtwin.idreg AND
                        reg.register = 2
                );

[...]

We have the problem, that we cannot see any potential to improve SQL
statements. Indexing the attr* columns seems not to be an solution,
because the data mustn't be unique (twins) and changes really often so
reindexing will took too long.

1. your database design is the real culprit here.  If you want things
to run really quickly, solve the problem there by normalizing your
schema. denomalization is the root cause of many, many, problems
posted here on this list.
2. barring that, the above query will run fastest by creating
multi-column indexes on regtwin (attr*) fields. and reg(attr*).  the
real solution to problems like this is often proper idnexing,
especially multi column.  saying indexes take to long to build is like
saying: 'i have a problem, so i am going to replace it with a much
worse problem'.
3. try where exists/not exists instead of where in/not in

merlin