I thought that an index was implicitly created for foreign keys, but I see that that's not true. I've just created one now and re-ran the query but it did not change the query plan or run time. Thanks, Sean ________________________________________ From: salah jubeh [s_jubeh@xxxxxxxxx] Sent: Monday, October 29, 2012 3:18 PM To: Woolcock, Sean; pgsql-performance@xxxxxxxxxxxxxx Subject: Re: Request for help with slow query Did you try to add an index on filesystem_id ________________________________ From: "Woolcock, Sean" <Sean.Woolcock@xxxxxxx> To: "pgsql-performance@xxxxxxxxxxxxxx" <pgsql-performance@xxxxxxxxxxxxxx> Sent: Monday, October 29, 2012 6:41 PM Subject: Request for help with slow query Hi, thanks for any help. I've tried to be thorough, but let me know if I should provide more information. A description of what you are trying to achieve and what results you expect: I have a large (3 million row) table called "tape" that represents files, which I join to a small (100 row) table called "filesystem" that represents filesystems. I have a web interface that allows you to sort by a number of fields in the tape table and view the results 100 at a time (using LIMIT and OFFSET). The data only changes hourly and I do a "vacuum analyze" after all changes. The tables are defined as: create table filesystem ( id serial primary key, host varchar(256), storage_path varchar(2048) not null check (storage_path != ''), mounted_on varchar(2048) not null check (mounted_on != ''), constraint unique_fs unique(host, storage_path) ); create table tape ( id serial primary key, volser char(255) not null check (volser != ''), path varchar(2048) not null check (path != ''), scratched boolean not null default FALSE, last_write_date timestamp not null default current_timestamp, last_access_date timestamp not null default current_timestamp, filesystem_id integer references filesystem not null, size bigint not null check (size >= 0), worm_status char, encryption char, job_name char(8), job_step char(8), dsname char(17), recfm char(3), block_size int, lrecl int, constraint filesystem_already_has_that_volser unique(filesystem_id, volser) ); An example query that's running slowly for me is: select tape.volser, tape.path, tape.scratched, tape.size, extract(epoch from tape.last_write_date) as last_write_date, extract(epoch from tape.last_access_date) as last_access_date from tape inner join filesystem on (tape.filesystem_id = filesystem.id<http://filesystem.id/>) order by last_write_date desc limit 100 offset 100; On Postgres 8.1.17 this takes about 60 seconds. I would like it to be faster. Here's the explain output: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=3226201.13..3226201.38 rows=100 width=308) (actual time=66311.929..66312.053 rows=100 loops=1) -> Sort (cost=3226200.88..3234250.28 rows=3219757 width=308) (actual time=66311.826..66311.965 rows=200 loops=1) Sort Key: date_part('epoch'::text, tape.last_write_date) -> Hash Join (cost=3.26..242948.97 rows=3219757 width=308) (actual time=3.165..31680.830 rows=3219757 loops=1) Hash Cond: ("outer".filesystem_id = "inner".id) -> Seq Scan on tape (cost=0.00..178550.57 rows=3219757 width=312) (actual time=2.824..18175.863 rows=3219757 loops=1) -> Hash (cost=3.01..3.01 rows=101 width=4) (actual time=0.204..0.204 rows=101 loops=1) -> Seq Scan on filesystem (cost=0.00..3.01 rows=101 width=4) (actual time=0.004..0.116 rows=101 loops=1) Total runtime: 66553.643 ms Here's a depesz link with that output: http://explain.depesz.com/s/AUR Things I've tried: 1. I added an index on last_write_date with: create index tape_last_write_date_idx on tape(last_write_date); and there was no improvement in query time. 2. I bumped: effective_cache_size to 1/2 system RAM (1GB) shared_buffers to 1/4 system RAM (512MB) work_mem to 10MB and there was no improvement in query time. 3. I ran the query against the same data in Postgres 9.1.6 rather than 8.1.17 using the same hardware and it was about 5 times faster (nice work, whoever did that!). Unfortunately upgrading is not an option, so this is more of an anecdote. I would think the query could go much faster in either environment with some optimization. The EXACT PostgreSQL version you are running: PostgreSQL 8.1.17 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20070115 (SUSE Linux) How you installed PostgreSQL: Standard SuSE SLES 10-SP3 RPMs: postgresql-devel-8.1.17-0.3 postgresql-pl-8.1.17-0.4 postgresql-libs-8.1.17-0.3 postgresql-8.1.17-0.3 postgresql-server-8.1.17-0.3 postgresql-contrib-8.1.17-0.3 Changes made to the settings in the postgresql.conf file: Only the memory changes mentioned above. Operating system and version: Linux acp1 2.6.16.60-0.54.5-default #1 Fri Sep 4 01:28:03 UTC 2009 i686 i686 i386 GNU/Linux SLES 10-SP3 What program you're using to connect to PostgreSQL: Perl DBI Perl v5.8.8 What version of the ODBC/JDBC/ADO/etc driver you're using, if any: perl-DBD-Pg 1.43 If you're using a connection pool, load balancer or application server, which one you're using and its version: None. Is there anything remotely unusual in the PostgreSQL server logs? No, they're empty. CPU manufacturer and model: Intel Celeron CPU 440 @ 2.00GHz Amount and size of RAM installed: 2GB RAM Storage details (important for performance and corruption questions): Do you use a RAID controller? No. How many hard disks are connected to the system and what types are they? We use a single Hitachi HDT72102 SATA drive (250GB) 7200 RPM. How are your disks arranged for storage? Postgres lives on the same 100GB ext3 partition as the OS. Thanks, Sean -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx<mailto:pgsql-performance@xxxxxxxxxxxxxx>) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance