optimizer behavior in the case of highly updated tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I am looking for some specific information regarding optimizer behavior.

We recently experienced a situation where a query that was previously using a btree lookup (efficient) SWITCHED to using seqscan/hash lookup.

 

My questions would be:

 

-          Under what circumstances is the optimizer likely to CHANGE behavior from using a btree index lookup to using a seq scan/hash lookup?

-          What are the critical decision factors that would feed into the optimizer making such a change?

-          Is it possible to measure any metrics in a way that would enable a prediction of such a change?

 

Platform

 

-          8.3.10 (64bit) on RHEL5.

-          Linux xxxxx 2.6.18-164.10.1.el5xen #1 SMP Thu Jan 7 20:28:30 EST 2010 x86_64 x86_64 x86_64 GNU/Linux

 

Application

The table in question is:

-          30m rows, variable length (contains varchar fields), rowlength avg about 120B

-          Approx. 3m unique values in the index column

 

Activity on the table would be, per row:  “Insert, multiple updates, delete after 90 days”

 

We vacuum analyze this table once/weekly.

No partitions are used.

 

Our experience which prompts this question was as follows:

-          If the table is not “vacuum analyze’d” at least once/week, the query plans become unpredictable as to whether they will use btree or seqscan/hash lookup

-          Until last week, “vacuum analyze” was sufficient

-          Friday evening of last week, the query plan for selected queries against this index changed again, but “vacuum analyze” was insufficient

-          Rebuilding index on primary key and on the column index was insufficient

-          It was necessary to take a site outage and perform a “vacuum full analyze” on the table

-          Following this, the query plan reverted to the more efficient btree lookup

 

Clearly, the garbage buildup resulting from transaction activity on the table is the villain here.

-          Is it possible to calculate expected space usage given row count and average row size

-          At what point might the ratio of “expected”/”actual” space usage be able to indicate the need to perform “full vacuum”, or similar maintenance

 

Any observations/comments that anyone would care to make are welcome.

Thanks in advance for your time

Mr

 

 

 

 


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux