Hi Craig and Cédric,

Thanks for the very informative introduction to the netiquette here, and thanks for sharing your time. I wasn't aware of http://explain.depesz.com/ -- very useful.

So, here are the query plans:
http://explain.depesz.com/s/6AU (1st from previous post, good)
http://explain.depesz.com/s/YPS (2nd from previous post, bad)

> The usual cause is that the statistics for estimated row counts cross a
> threshold that makes the query planner think that a different kind of
> plan will be faster.

Hm, as far as I understand the plans, they are equivalent, aren't they?

> If the query planner is using bad information about the performance of
> the storage, then it will be making bad decisions about which approach
> is faster. So the usual thing to do is to adjust seq_page_cost and
> random_page_cost to more closely reflect the real performance of your
> hardware, and to make sure that effective_cache_size matches the real
> amount of memory your computer has free for disk cache use.

Will this make any difference even when the plans are equivalent, as assumed above?

The table creation SQL is as follows:
http://pastebin.com/qFDUP7Aa (message table): ~2,328,680 rows, growing constantly (~10,000 new rows each day)
http://pastebin.com/vEmh4hb8 (box table): ~128 rows, growing very slowly (1 row every two days; each row is updated about twice a day)

The DB contains the same data in both cases, except that for the "good" query, the last 10,976 rows (0.4%) of message are removed by doing a

DELETE FROM message WHERE timestamp > TO_DATE('05/23/2011', 'mm/dd/yyyy');

This speeds up the query by a factor of ~27 (207,033.081 ms (bad) vs. 7,683.978 ms (good)). Each query was run before and after a VACUUM ANALYZE: the first run to create appropriate statistics, and the second run to do the actual measurement.
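Regarding the cost settings Craig suggested: if it helps, here is a minimal sketch of how I could try them out per session before re-running the query under EXPLAIN ANALYZE. The values are purely illustrative guesses for a mostly-cached database, not recommendations:

```sql
-- Illustrative only: planner cost settings for a database that largely
-- fits in RAM. SET is session-local; postgresql.conf would persist them.
SET seq_page_cost = 1.0;
SET random_page_cost = 1.1;        -- close to seq_page_cost when data is cached
SET effective_cache_size = '4GB';  -- roughly shared_buffers + OS disk cache
-- Then re-run the query under EXPLAIN ANALYZE to see whether the plan changes.
```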
All tests were made on the dev machine, which is an 8 GB, Core i7, Windows 7 system. I experienced the issue first on the "production" environment, which is a 64-bit Ubuntu running PostgreSQL 9.0.1 on x86_64-unknown-linux-gnu, compiled by GCC 4.1.2 20080704 (Red Hat 4.1.2-46), 64-bit, and later reproduced it for analysis on the dev environment, which is a 64-bit Windows 7 running PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit.

For testing, I've increased the buffers that I judge important for the issue to the following values:

effective_cache_size: 4GB
shared_buffers: 1GB
work_mem: 1GB
temp_buffers: 32MB

After that, the configuration was reloaded and the postgresql service was restarted using pgAdmin. Interestingly, there was no performance gain compared to the default settings; the "bad" query even took about 30 seconds (15%) longer. It also seems that all the data fits into memory, so there is not much disk I/O involved.

@Cédric
> did you have log of vacuum and checkpoint activity ?
> (no vacuum full/cluster or such thing running ?)

There is no clustering involved here; it's a pretty basic setup. How can I obtain the information you require here? I could send you the output of the VACUUM ANALYZE command from pgAdmin, but is there a way to make it output the information in English (rather than German)?

Thanks for your interest in this issue.

Regards,
panam

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4422247.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance