Re: Can this query go faster???

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At 04:43 AM 12/6/2005, Joost Kraaijeveld wrote:
Hi,

Is it possible to get this query run faster than it does now, by adding
indexes, changing the query?

SELECT customers.objectid FROM prototype.customers, prototype.addresses
WHERE
customers.contactaddress = addresses.objectid
ORDER BY zipCode asc, housenumber asc
LIMIT 1 OFFSET 283745

Explain:

Limit  (cost=90956.71..90956.71 rows=1 width=55)
  ->  Sort  (cost=90247.34..91169.63 rows=368915 width=55)
        Sort Key: addresses.zipcode, addresses.housenumber
        ->  Hash Join  (cost=14598.44..56135.75 rows=368915 width=55)
              Hash Cond: ("outer".contactaddress = "inner".objectid)
              ->  Seq Scan on customers  (cost=0.00..31392.15
rows=368915 width=80)
              ->  Hash  (cost=13675.15..13675.15 rows=369315 width=55)
                    ->  Seq Scan on addresses  (cost=0.00..13675.15
rows=369315 width=55)

The customers table has an index on contactaddress and objectid.
The addresses table has an index on zipcode+housenumber and objectid.

TIA
customer names, customers.objectid, addresses, and addresses.objectid should all be static (addresses do not change, just the customers associated with them; and once a customer has been assigned an id that better never change...).

To me, this sounds like the addresses and customers tables should be duplicated and then physically laid out in sorted order by <tablename>.objectid in one set and by the "human friendly" associated string in the other set. Then a finding a specific <tablename>.objectid or it's associated string can be done in at worse O(lgn) time assuming binary search instead of O(n) time for a sequential scan. If pg is clever enough, it might be able to do better than that.

IOW, I'd try duplicating the addresses and customers tables and using the appropriate CLUSTERed Index on each.

I know this breaks Normal Form. OTOH, this kind of thing is common practice for data mining problems on static or almost static data.

Hope this is helpful,
Ron




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux