Re: Slow-ish Query Needs Some Love


On 2/3/2010 11:17 AM, Matt White wrote:
On Feb 2, 1:11 pm, a...@xxxxxxxxxxxxxxx (Andy Colson) wrote:
On 2/2/2010 1:03 PM, Matt White wrote:

On Feb 2, 6:06 am, Edgardo Portal<egportal2...@xxxxxxxxx>    wrote:
On 2010-02-02, Matt White<mattw...@xxxxxxxxx>    wrote:

I have a relatively straightforward query that by itself isn't that
slow, but we have to run it up to 40 times on one webpage load, so it
needs to run much faster than it does. Here it is:

SELECT COUNT(*) FROM users, user_groups
   WHERE users.user_group_id = user_groups.id AND NOT users.deleted AND
user_groups.partner_id IN
   (partner_id_1, partner_id_2);

The structure is partners have user groups which have users. In the
test data there are over 200,000 user groups and users but only ~3000
partners. Anyone have any bright ideas on how to speed this query up?

Can you avoid running it 40 times, maybe by restructuring the
query (or making a view) along the lines of the following and
adding some logic to your page?

SELECT p.partner_id, ug.user_group_id, u.id, count(*)
    FROM partners p
         LEFT JOIN user_groups ug
                ON ug.partner_id=p.partner_id
         LEFT JOIN users u
                ON u.user_group_id=ug.id
   WHERE NOT u.deleted
   GROUP BY 1,2,3
;

Thanks for the suggestion. The view didn't seem to speed things up.
Perhaps we can reduce the number of times it's called, we'll see. Any
additional ideas would be helpful. Thanks.

I agree with Edgardo, I think the biggest time saver will be reducing
trips to the database.

But... do you have an index on users.user_group_id?

Does rewriting it change the plan any?

SELECT COUNT(*) FROM users
inner join user_groups on (users.user_group_id = user_groups.id)
where NOT users.deleted
AND user_groups.partner_id IN (partner_id_1, partner_id_2);
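If that index turns out to be missing, creating it is cheap to try (index name is just an example; the table and column names are taken from the query above):

```sql
-- Only needed if users.user_group_id is not already indexed.
CREATE INDEX users_user_group_id_idx ON users (user_group_id);

-- Refresh planner statistics so the new index gets considered sensibly.
ANALYZE users;
```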

And... it looks like the row guesstimate is off a little:

Index Scan using user_groups_partner_id_idx
on user_groups
(cost=0.00..133.86 rows=3346 width=8)
(actual time=0.049..96.992 rows=100001 loops=2)

It guessed 3,346 rows, but actually got 100,001.  Have you run an
analyze on it?  If so, maybe bumping up the stats might help?

-Andy

--
Sent via pgsql-performance mailing list (pgsql-performa...@xxxxxxxxxxxxxx)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

Andy,

I have run analyze, see my query plan in my original post. You'll have
to forgive me for being a bit of a Postgres noob but what do you mean
by "bumping up the stats"?

That's not what I meant. "explain analyze select ..." is what you did, and that's correct. What I meant was "analyze user_groups".

see:
http://www.postgresql.org/docs/8.4/interactive/sql-analyze.html


An analyze makes PG look at a table and calculate statistics on it, so it can make better guesses. By default, analyze samples only a few rows (well, a small percentage of the rows) and estimates the entire table from that sample. If it guesses wrong, sometimes you need to tell it to analyze more rows (i.e., a bigger percentage of the table).

By "bumping the stats" I was referring to this:

http://wiki.postgresql.org/wiki/Planner_Statistics

I have never had to do it, so I don't know much about it. It may or may not help. Just thought it was something you could try.
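For what it's worth, the per-column knob described on that wiki page looks roughly like this (the column choice and the target value of 500 are just an illustration, not a recommendation):

```sql
-- Sample more rows for the badly estimated column (default target is much lower),
-- then re-analyze so the planner picks up the new statistics.
ALTER TABLE user_groups ALTER COLUMN partner_id SET STATISTICS 500;
ANALYZE user_groups;
```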

-Andy


