Marc Cousin wrote:
> For now I've pushed the problem out to somewhat larger batch sizes by changing
> random_page_cost to 0.02 and seq_page_cost to 0.01, but I feel like a
> sorcerer's apprentice with this, as I've told PostgreSQL that fetching rows
> from disk is much cheaper than it really is. These values are, I think,
> completely abnormal.
They certainly don't have anything to do with reality. Try putting them
back to (say) seq_page_cost=1 and random_page_cost=2.
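For example, per session while you experiment (the exact values are only
illustrative, not tuned for your hardware):

    SET seq_page_cost = 1.0;
    SET random_page_cost = 2.0;

Once you've settled on values, put them in postgresql.conf and reload.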
> So, finally, to my questions:
> - Is it normal for PostgreSQL to be this far off base on these queries? (Sorry,
> I don't have the plans; if they're needed I'll do my best to get some, but
> they really are the two obvious plans for this kind of query.) What could
> make it choose the hash join for batch tables that are too small?
No point in speculating without plans.
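If you can, capture both the good and the bad case with EXPLAIN ANALYZE and
post the output; the estimated versus actual row counts are what matter.
Table and column names below are just placeholders for your actual batch join:

    -- placeholder query, substitute your real join
    EXPLAIN ANALYZE
    SELECT *
    FROM batch b
    JOIN filename f ON f.name = b.name;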
> - Is changing the two costs the way to go?
Not the way you have.
> - Is there a way to tell PostgreSQL that sorting is more costly than it
> thinks? (Instead of telling it that fetching data from disk doesn't cost
> anything.)
That's what the configuration settings do. But if you put a couple of them way
off from reality, it'll be pure chance if it gets any estimates right.
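As a rough pointer rather than a tuning recommendation: the cost of sorting is
driven by the CPU cost settings, not the page costs, so those are the knobs to
look at if the plans prove the planner is under-costing sorts. Defaults shown:

    SET cpu_tuple_cost = 0.01;         -- default
    SET cpu_index_tuple_cost = 0.005;  -- default
    SET cpu_operator_cost = 0.0025;    -- default; feeds the per-comparison cost of a sort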
> Here are the other non-default values from my configuration:
> shared_buffers = 2GB
> work_mem = 64MB
Set this *much* higher when you are running your bulk imports. You can
do it per-connection. Try 256MB, 512MB, 1GB (but keep an eye on total
memory used).
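For example, only for the session doing the import:

    -- per-connection, lasts until RESET or disconnect
    SET work_mem = '512MB';
    -- ... run the bulk import here ...
    RESET work_mem;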
> maintenance_work_mem = 256MB
> max_fsm_pages = 15000000  # There are quite big deletes with bacula ...
> effective_cache_size = 800MB
See other emails on this one.
> default_statistics_target = 1000
Probably don't need this for all columns, but it won't cause problems
with these queries.
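If you only need the higher target on the join columns, you can set it per
column instead of globally, along these lines (column name is a placeholder):

    ALTER TABLE batch ALTER COLUMN name SET STATISTICS 1000;
    ANALYZE batch;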
--
Richard Huxton
Archonet Ltd