
Re: Slow first query despite LIMIT and OFFSET clause


 



On Jan 29, 2009, at 1:35 AM, Phoenix Kiula wrote:

On Thu, Jan 29, 2009 at 2:25 AM, Alban Hertroys
<dalroi@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

Ah I see, that's the original query and its plan again, not the one after implementing those triggers! You had me scratching my head for a bit there,
wondering why the count() subquery was still there.

Yes, that was just for info. Here is the new query without the count()
in it:


explain analyze SELECT
         testimonials.url
         ,testimonials.alias
         ,testimonials.aliasEntered
         ,testimonials.title
         ,testimonials.modify_date
         ,testimonials.id
         ,visitcount.visit_count
         ,visitcount.unique_count
         ,visitcount.modify_date
         ,coalesce(  extract(epoch from now()) -  extract(epoch from
visitcount.modify_date), 0)
 FROM testimonials
 LEFT OUTER JOIN visitcount USING (id)
 WHERE
              testimonials.user_id = 'superman'
      and testimonials.user_known = 1
      and testimonials.status = 'Y'
 ORDER BY testimonials.modify_date desc
 OFFSET 0 LIMIT 10
;


QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=61.42..61.44 rows=10 width=162) (actual
time=105.400..105.499 rows=10 loops=1)
  ->  Sort  (cost=61.42..61.46 rows=16 width=162) (actual
time=105.392..105.425 rows=10 loops=1)
        Sort Key: testimonials.modify_date
        ->  Nested Loop Left Join  (cost=0.00..61.10 rows=16
width=162) (actual time=0.092..94.516 rows=2027 loops=1)
              ->  Index Scan using new_idx_userknown on testimonials
(cost=0.00..24.29 rows=16 width=146) (actual time=0.058..10.983
rows=2027 loops=1)
                    Index Cond: ((user_id)::text = 'superman'::text)
                    Filter: (status = 'Y'::bpchar)
              ->  Index Scan using visitcount_pkey1 on visitcount
(cost=0.00..2.28 rows=1 width=24) (actual time=0.024..0.026 rows=1
loops=2027)
                    Index Cond: (testimonials.id = visitcount.id)
Total runtime: 105.652 ms
(10 rows)

Note that I have an index on user_id, but because this is a website,
there are several user_ids where we only have the IP. The above query
is only ever needed for registered users, so for just the registered
users we created another partial index called

    "new_idx_userknown" btree (user_id) WHERE user_known = 1

Of course for unregistered users we use user_known = 0, so they are
excluded from this index. Is this not a useful partial index? In this
SQL the user_id is always "superman" and user_known is always 1, which
may be why the planner's estimate is off?
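
One way to check whether that estimate (rows=16 vs. the actual 2027) comes from stale or too-coarse column statistics is to look at what the planner knows about user_id, and raise the per-column statistics target if 'superman' isn't among the sampled common values. A sketch, using your table and column names as given above:

```sql
-- Inspect the statistics the planner has for user_id
-- (n_distinct and most_common_vals drive the rows=16 estimate).
SELECT attname, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'testimonials' AND attname = 'user_id';

-- If 'superman' is missing from most_common_vals, a larger sample
-- plus a fresh ANALYZE may fix the estimate:
ALTER TABLE testimonials ALTER COLUMN user_id SET STATISTICS 1000;
ANALYZE testimonials;
```

The value 1000 is illustrative; the default statistics target is much lower, and the right setting depends on how skewed the user_id distribution is.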

Love to hear thoughts.


Well, that seems to have gotten rid of the somewhat expensive index scans on new_idx_userknown as well (the duplicate entry for that scan was of course due to the subquery).

What's remaining is the left join.
If I understand correctly you have a PK on visitcount.id and that table only contains records for people who have a visitcount > 0? That table gets updated a lot I'd think? The query plan still shows a bad estimate on that join; it has improved, but not enough.

Does the plan look better right after you ANALYSE visitcount? I suspect you either need to autovacuum visitcount more frequently or you need to increase the statistics size on visitcount.id. You're updating that table a lot, I think, and every update creates a new dead row. Letting vacuum mark the dead rows as reusable more frequently should also help keep that table and its indexes cleaner, although the records the indexes point to will be scattered all over the place.
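
As a sketch of what that could look like (the numbers are illustrative, not tuned for your workload, and per-table autovacuum settings via ALTER TABLE reloptions are only available in newer releases; on older versions the pg_autovacuum catalog serves the same purpose):

```sql
-- Refresh the statistics by hand and re-run EXPLAIN ANALYZE
-- to see whether the join estimate improves:
ANALYZE visitcount;

-- If the estimate is still off, sample the join column harder:
ALTER TABLE visitcount ALTER COLUMN id SET STATISTICS 500;
ANALYZE visitcount;

-- Make autovacuum visit this heavily-updated table more often
-- (thresholds here are examples only):
ALTER TABLE visitcount SET (
    autovacuum_vacuum_scale_factor = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);
```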



I'm wondering... In heavily updated tables it's probably more efficient to leave the dead rows alone (just marking them dead) and only append the updated ones at the end of the table. The dead rows would then accumulate at the start of the table while the new ones go to the end, and after a while a large section at the start of the table could simply be removed, as it would contain only dead rows... This may already be in place of course; I don't have time now to look into the design specifics, and it seems like an obvious thing to do!
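
Whatever the storage internals do, the dead-row buildup itself is easy to watch. A sketch using the standard statistics views (the table name is from the thread; plain VACUUM marks dead rows reusable without taking an exclusive lock):

```sql
-- See how many dead rows have accumulated and when the table
-- was last vacuumed, manually or by autovacuum:
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'visitcount';

-- Reclaim the dead rows and refresh statistics in one pass:
VACUUM ANALYZE visitcount;
```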

Regards,
Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
