Erik Jones <erik 'at' myemma.com> writes:

>> vacuum_cost_delay/vacuum_cost_limit  (deactivated)  20/200  40/200  100/1000  150/1000  200/1000  300/1000
>>
>> VACUUM ANALYZE time                      54 s       112 s   188 s    109 s     152 s     190 s     274 s
>> SELECT time                              50 s        28 s    26 s     24 s      22 s      20 s      19 s
>
> While you do mention that the table you're running your select on is
> too big to fit in the shared_buffers, the drop in time between the
> first run and the rest most likely still reflects the fact that when

These figures don't show a difference between the first run and
subsequent runs. For each parameter setting, a couple of runs are
fired after a database restart, and once the value is approximately
constant, it is picked and put in this table. The "(deactivated)"
column shows the (stable, from subsequent runs) figure with the
vacuum delay disabled (vacuum_cost_delay parameter commented out),
not the first run, if that's where the confusion came from.

> running those tests successively a good portion of the table will
> already be in shared_buffers as well as being in the filesystem
> cache, i.e. very little of the runs after the first will have to hit

A dd sized to the total RAM is run between each test (not just
between each parameter setting - between each *query test*) to remove
the OS disk cache effect; a rough sketch of the whole procedure is
near the end of this message. Of course, the PostgreSQL caching
effect cannot be removed (and maybe it shouldn't be, as that caching
is there to improve performance, after all), but the query was chosen
to generate a lot of disk activity even between runs (that's why I
said "a query which cannot fit entirely in buffers").

> the disk much.

I have of course checked that the subsequent runs are dominated by
disk activity, not CPU activity.

>> - alternatively, pg could make use of some longer-term statistics
>>   (load average, IO statistics) to dynamically pause the vacuuming -
>>   this I guess is related to the host OS and probably more difficult
>>   to get working correctly with multiple disks and/or processes
>>   running - however, if you want high performance from PostgreSQL,
>>   you probably won't host other IO applications on the same disk(s)
>
> These ideas have been discussed much. Look in the archives back to
> the beginning of this year. I think the general consensus was that it

Was that on pgsql-hackers? I haven't found much in pgsql-performance
when searching for "vacuum_cost_delay tuning".

> would be good to have multiple autovacuum workers that could be tuned
> for different times or workloads. I know Alvaro was going to work

Sounds interesting.

>> I'm wondering if it would not be possible to dynamically ignore (or
>> lower, if it makes more sense?) the Cost-Based Vacuum Delay during
>> VACUUM FULL, if a configurable number of queries are waiting for the
>> lock?
>>
>> (please save yourself from answering "you should never run VACUUM
>> FULL if you're vacuuming enough" - as long as VACUUM FULL is
>> available in PostgreSQL, there's no reason not to make it as
>> practically usable as possible, albeit with low dev priority)
>
> Ok, I won't say what you said not to say. But, I will say that I
> don't agree with your conjecture that VACUUM FULL should be made more
> lightweight; it's like using dynamite to knock a hole in a wall for a
> window.

Thanks for opening a new kind of trol^Hargument against VACUUM FULL;
that one's fresher (at least to me, who doesn't follow the list too
closely anyway).
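For reproducibility, here is roughly how each column of the table
above is produced - a simplified sketch, not the actual script: the
database name "test", the table name "bigtable", the 2048 MB RAM
figure and the scratch file path are placeholders, and it leaves out
the database restarts and the repeated runs until timings stabilize:

    #!/bin/bash
    # All names and sizes below are placeholders (see above).
    DB=test
    RAM_MB=2048                  # total RAM of the machine, in MB

    # one "delay limit" pair per column of the table;
    # delay 0 disables the cost-based vacuum delay
    for params in "0 200" "20 200" "40 200" "100 1000" \
                  "150 1000" "200 1000" "300 1000"; do
        set -- $params; delay=$1; limit=$2

        # time VACUUM ANALYZE under the settings being tested; PGOPTIONS
        # passes them to the backend as per-session server options
        time env PGOPTIONS="-c vacuum_cost_delay=$delay -c vacuum_cost_limit=$limit" \
            psql -d "$DB" -c "VACUUM ANALYZE bigtable"

        # write a RAM-sized file to push the table out of the OS disk cache
        dd if=/dev/zero of=/var/tmp/flush bs=1M count=$RAM_MB
        rm -f /var/tmp/flush

        # time the query used to measure the impact on concurrent activity
        time psql -d "$DB" -c "SELECT count(*) FROM bigtable"
    done

(PGOPTIONS is used instead of putting the SETs and the VACUUM in a
single psql -c, because psql would then send them as one implicit
transaction block, in which VACUUM refuses to run.)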
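Incidentally, for anyone wondering whether their own tables are in
the state described below, the statistics views give a rough
dead-tuple ratio. A minimal sketch - the database name is again a
placeholder, and it assumes the stats collector is enabled and a
PostgreSQL version whose pg_stat_user_tables exposes
n_live_tup/n_dead_tup:

    psql -d test -c "
        SELECT relname, n_live_tup, n_dead_tup,
               round(100.0 * n_dead_tup
                     / nullif(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
          FROM pg_stat_user_tables
         ORDER BY n_dead_tup DESC
         LIMIT 10"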
Just for the record, I inherited a poorly (actually, "not" would be
more appropriate) tuned database, with more than 90% dead tuples in
its large tables, and I witnessed quite a performance improvement
once I was able to fix that.

-- 
Guillaume Cottenceau, MNC Mobile News Channel SA,
an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36