Re: Hash Anti Join performance degradation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi there,


Kevin Grittner wrote:
> 
>> Is there a way to determine the values actually used?
> The pg_settings view.  Try the query shown here:
> http://wiki.postgresql.org/wiki/Server_Configuration
> 
Thanks Kevin, very usful. Here is the output:

"version";"PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit"
"bytea_output";"escape"
"client_encoding";"UNICODE"
"effective_cache_size";"4GB"
"lc_collate";"German_Germany.1252"
"lc_ctype";"German_Germany.1252"
"listen_addresses";"*"
"log_destination";"stderr"
"log_line_prefix";"%t "
"logging_collector";"on"
"max_connections";"100"
"max_stack_depth";"2MB"
"port";"5432"
"server_encoding";"UTF8"
"shared_buffers";"1GB"
"temp_buffers";"4096"
"TimeZone";"CET"
"work_mem";"1GB"


Craig Ringer wrote:
> 
> On 05/26/2011 12:42 AM, panam wrote:
> It's a bit beyond me, but I suspect that it'd be best if you could hang 
> onto the dump file in case someone has the time and enthusiasm to 
> investigate it. I take it you can't distribute the dump file, even 
> privately?
> 
Fortunately, I managed to reduce it to the absolute minimum (i.e. only
meaningless ids), and the issue is still observable.
You can download it from here:
http://www.zumodrive.com/file/460997770?key=cIdeODVlNz

Some things to try:
* tune your psql settings if you want
* reindex, vaccum analzye if you want

"Patholgical" query:

select
	b.id,
	(SELECT
		m1.id 
	FROM
		message m1 
	LEFT JOIN
		message m2 
			ON (
				m1.box_id = m2.box_id 
				AND m1.id < m2.id
			) 
	WHERE
		m2.id IS NULL 
		AND m1.box_id = b.id)
from
	box b

=> takes almost "forever" (~600 seconds on my system)

Try

delete from message where id > 2550000;

=> deletes 78404 rows
Do the "pathological" query again
=> speed is back (~4 seconds on my system)

Replay the dump
Try

delete from message where id < 1000000;

=> deletes 835844 (10 times than before) rows. Maybe you can delete many
more, I haven't tested this systematically.
Do the "pathological" query again
=> takes almost "forever" (didn't wait...)

Replay the dump
Cluster:

cluster message_pkey on message;

Do the "pathological" query again
=> speed is back (~3 seconds on my system)

Any third party confirmation?

Thanks
panam

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4428435.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux