Re: Slow query: table iteration (8.3)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 4, 2010 at 4:09 AM, Glenn Maynard <glenn@xxxxxxxx> wrote:
> But I'll be dropping this db into 8.4 soon to see if it helps
> anything, and I'll check again (and if it's still slow I'll post more
> details).  It's been a while and I might just have been doing
> something wrong.

Windowing doesn't want to scale for this case.  I'll simplify to give
an actual test case.

create table test_users (id serial primary key);
insert into test_users (id) select generate_series(1, 1000);
create table test (id serial primary key, score integer, user_id integer);
insert into test (user_id, score) select s.id, random() * 1000000 from
(select generate_series(1, 1000) as id) as s, generate_series(1,
1000);
create index test_1 on test (score);
create index test_2 on test (user_id, score desc);
analyze;

This generates a thousand users, with a thousand scores each.  This
finds the top score for each user (ignoring the detail of duplicate
scores; easy to deal with):

SELECT sub.id FROM (
    SELECT t.id, rank() OVER (PARTITION BY t.user_id ORDER BY score
DESC) AS rank
    FROM test t
) AS sub WHERE rank <= 1;

This does use the test_2 index (as intended), but it's still very
slow: 11 seconds on my system.

It seems like it's doing a *complete* scan of the index, generating
ranks for every row, and then filters out all but the first of each
rank.  That means it scales linearly with the total number of rows.
All it really needs to do is jump to each user in the index and pull
out the first entry (according to the "score desc" part of the test_2
index), which would make it scale linearly with the number of users.

The function version:

CREATE FUNCTION high_score_for_user(user_id int) RETURNS SETOF INTEGER
LANGUAGE SQL AS $$
       SELECT t.id FROM test t
       WHERE t.user_id = $1
       ORDER BY t.score DESC LIMIT 1
$$;
SELECT high_score_for_user(u.id) FROM test_users u;

runs in 100ms.

I think I'm stuck with either creating temporary functions with the
constants already replaced, or creating an SQL function that evaluates
a brand new query as a string as Yeb suggested.

-- 
Glenn Maynard

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux