2011/5/26 panam <panam@xxxxxxx>: > Hi there, > > > Kevin Grittner wrote: >> >>> Is there a way to determine the values actually used? >> The pg_settings view. Try the query shown here: >> http://wiki.postgresql.org/wiki/Server_Configuration >> > Thanks Kevin, very usful. Here is the output: > > "version";"PostgreSQL 9.0.4, compiled by Visual C++ build 1500, 64-bit" > "bytea_output";"escape" > "client_encoding";"UNICODE" > "effective_cache_size";"4GB" > "lc_collate";"German_Germany.1252" > "lc_ctype";"German_Germany.1252" > "listen_addresses";"*" > "log_destination";"stderr" > "log_line_prefix";"%t " > "logging_collector";"on" > "max_connections";"100" > "max_stack_depth";"2MB" > "port";"5432" > "server_encoding";"UTF8" > "shared_buffers";"1GB" > "temp_buffers";"4096" > "TimeZone";"CET" > "work_mem";"1GB" > > > Craig Ringer wrote: >> >> On 05/26/2011 12:42 AM, panam wrote: >> It's a bit beyond me, but I suspect that it'd be best if you could hang >> onto the dump file in case someone has the time and enthusiasm to >> investigate it. I take it you can't distribute the dump file, even >> privately? >> > Fortunately, I managed to reduce it to the absolute minimum (i.e. only > meaningless ids), and the issue is still observable. > You can download it from here: > http://www.zumodrive.com/file/460997770?key=cIdeODVlNz > > Some things to try: > * tune your psql settings if you want > * reindex, vaccum analzye if you want > > "Patholgical" query: > > select > b.id, > (SELECT > m1.id > FROM > message m1 > LEFT JOIN > message m2 > ON ( > m1.box_id = m2.box_id > AND m1.id < m2.id > ) > WHERE > m2.id IS NULL > AND m1.box_id = b.id) > from > box b > > => takes almost "forever" (~600 seconds on my system) > > Try > > delete from message where id > 2550000; > > => deletes 78404 rows > Do the "pathological" query again > => speed is back (~4 seconds on my system) > > Replay the dump > Try > > delete from message where id < 1000000; > > => deletes 835844 (10 times than before) rows. Maybe you can delete many > more, I haven't tested this systematically. > Do the "pathological" query again > => takes almost "forever" (didn't wait...) > > Replay the dump > Cluster: > > cluster message_pkey on message; > > Do the "pathological" query again > => speed is back (~3 seconds on my system) > > Any third party confirmation? without explaining further why the antijoin has bad performance without cluster, I wonder why you don't use this query : SELECT b.id, max(m.id) FROM box b, message m WHERE m.box_id = b.id GROUP BY b.id; looks similar and fastest. > > Thanks > panam > > -- > View this message in context: http://postgresql.1045698.n5.nabble.com/Hash-Anti-Join-performance-degradation-tp4420974p4428435.html > Sent from the PostgreSQL - performance mailing list archive at Nabble.com. > > -- > Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance > -- Cédric Villemain 2ndQuadrant http://2ndQuadrant.fr/ ; PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance