Re: cube operations slower than geo_distance() on production server

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2/12/07, Mark Stosberg <mark@xxxxxxxxxxxxxxx> wrote:
Merlin Moncure wrote:
> On 2/10/07, Mark Stosberg <mark@xxxxxxxxxxxxxxx> wrote:
>>
>> With the help of some of this list, I was able to successfully  set up
>> and benchmark a cube-based replacement for geo_distance() calculations.
>>
>> On a development box, the cube-based variations benchmarked consistently
>> running in about 1/3 of the time of the gel_distance() equivalents.
>>
>> After setting up the same columns and indexes on a production
>> database, it's a different story. All the cube operations show
>> themselves to be about the same as, or noticeably slower than, the same
>> operations done with geo_distance().
>>
>> I've stared at the EXPLAIN ANALYZE output as much I can to figure what's
>> gone. Could you help?
>>
>> Here's the plan on the production server, which seems too slow. Below
>> is the plan I get in
>> on the development server, which is much faster.
>>
>> I tried "set enable_nestloop = off", which did change the plan, but
>> the performance.
>>
>> The production DB has much more data in it, but I still expected
>> comparable results relative
>> to using geo_distance() calculations.
>
> any objection to posting the query (any maybe tables, keys, indexes, etc)?

Here the basic query I'm using:
SELECT
 -- 1609.344 is a constant for "meters per mile"
 cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
'90210') , earth_coords)/1609.344
   AS RADIUS
   FROM pets
   -- "shelters_active" is a view where "shelter_state = 'active'"
   JOIN shelters_active as shelters USING (shelter_id)
   -- The zipcode fields here are varchars
   JOIN zipcodes ON (
        shelters.postal_code_for_joining = zipcodes.zipcode )
   -- search for just 'dogs'
   WHERE species_id = 1
       AND pet_state='available'
      AND earth_box(
        (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
10*1609.344
      ) @ earth_coords
   ORDER BY RADIUS;

your query looks a bit funky. here are the problems I see.

* in your field list, you don't need to re-query the zipcode table.
 cube_distance( (SELECT earth_coords from zipcodes WHERE zipcode =
'90210') , earth_coords)/1609.344 AS RADIUS

becomes

cube_distance(pets.earth_coords, earth_coords ) / 1609.344 AS RADIUS

also, dont. re-refer to the zipcodes table in the join clause. you are
already joining to it:
      AND earth_box(
        (SELECT earth_coords from zipcodes WHERE zipcode = '90210') ,
10*1609.344) @ earth_coords

becomes

 AND earth_box(zipcodes.earth_coords, 10*1609.344) ) @ pets.earth_coords

* also, does pet_state have any other states than 'available' and  '
not available'? if not, you should be using a boolean. if so, you can
consider a functional index to convert it to a booelan.

* if you always look up pets by species, we can explore composite
index columns on species, available (especially using the above
functional suggestion), etc.  composite > partial (imo)

thats just to start. play with it and see what comes up.

merlin


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux