Search Postgresql Archives

Re: Query planner question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 23 Aug 2014, at 4:34, Soni M <diptatapa@xxxxxxxxx> wrote:
> On Fri, Aug 22, 2014 at 9:10 PM, Alban Hertroys <haramrae@xxxxxxxxx> wrote:
> On 22 August 2014 14:26, Soni M <diptatapa@xxxxxxxxx> wrote:
> > Currently we have only latest_transmission_id as FK, described here :
> > TABLE "ticket" CONSTRAINT "fkcbe86b0c6ddac9e" FOREIGN KEY
> > (latest_transmission_id) REFERENCES transmission_base(transmission_id)
> >
> > Change the query to include only FK still result the same:
> > explain select t.ticket_id ,
> > tb.transmission_id
> > from ticket t,
> > transmission_base tb
> > where t.latest_transmission_id = tb.transmission_id
> > and tb.parse_date > ('2014-07-31');
> >                                                             QUERY PLAN
> > ----------------------------------------------------------------------------------------------------------------------------------
> >  Hash Join  (cost=113928.06..2583606.96 rows=200338 width=8)
> >    Hash Cond: (t.latest_transmission_id = tb.transmission_id)
> >    ->  Seq Scan on ticket t  (cost=0.00..1767767.26 rows=69990826 width=8)
> >    ->  Hash  (cost=108923.38..108923.38 rows=400374 width=4)
> >          ->  Index Scan using transmission_base_by_parse_date on
> > transmission_base tb  (cost=0.00..108923.38 rows=400374 width=4)
> >                Index Cond: (parse_date > '2014-07-31 00:00:00'::timestamp
> > without time zone)
> > (6 rows)
> 
> Do you have an index on ticket (latest_transmission_id)?
> 
> Yes, both t.latest_transmission_id and tb.transmission_id is indexed.
> 
> Indexes:
>     "transmission_base_pkey" PRIMARY KEY, btree (transmission_id) CLUSTER
> Indexes:
>     "ticket_by_latest_transmission" btree (latest_transmission_id)

Okay, so we got those indexes. So much for the low-hanging fruit.

>From the above plan we learn that the database estimates[1] that 400k rows from transmission match your condition (parse_date > '2014-07-31’). The ticket table has a foreign key to that table, which suggests a 1:n relationship. It also has around 70M rows, or at least the database seems to think that about that amount will match those 400k transmissions.

That means that if on average 175 (=70M/400k) ticket ID’s match a transmission ID, the database would be needing all those 70M rows anyway - and even if it only needs every 175th row, a sequential scan is not a particularly inefficient way to go about this.
The alternative is a whole lot of index lookups, probably not in the same order as either the index or the rows on disk, meaning quite a bit of random disk I/O.

I’m suspecting that the cost estimates for this query with seq-scans disabled aren’t very different, provided doing so comes up with a comparable plan.

Things you might want to verify/try:
* Are those estimated numbers of rows accurate? If not, is autovacuum (or scheduled vacuum) keeping up with the amount of data churn on these tables? Do you collect a sufficiently large sample for the statistics?
* How much bloat is in these tables/indexes?
* Did you change planner settings (such as disabling bitmap scans; I kind of expected one here) or did you change cost estimate parameters?
* Does it help to put an index on transmission (parse_date, transmission_id)?
* If none of that helps, we’re going to need the output of explain analyze - that will probably take long to create, so you might as well start with that and do the other stuff at the side.

What kind of hardware are these disks on? Is it possible that disk I/O on this particular machine is relatively slow (relative to the seq/random cost factor for disk access as specified in your postgresql.conf)?

Cheers,

Alban Hertroys

[1] You did not provide explain analyse output, so we only have estimates to work with.
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux