Search Postgresql Archives

Re: Overlapping ranges

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2014-06-18 at 18:08 -0600, Rob Sargent wrote:
On 06/18/2014 05:47 PM, Jason Long wrote:

I have a large table of access logs to an application.  

I want is to find all rows that overlap startdate and enddate with any
other rows.

The query below seems to work, but does not finish unless I specify a
single id.  

select distinct a1.id
from t_access a1, 
        t_access a2 
where tstzrange(a1.startdate, a1.enddate) && 
      tstzrange(a2.startdate, a2.enddate) 




I'm sure you're best bet is a windowing function, but your descriptions suggests there is no index on start/end date columns.  Probably want those in any event.

There are indexs on startdate and enddate.
If I specify a known a1.id=1234 then the query returns all records that overlap it, but this takes 1.7 seconds.

There are about 2 million records in the table.

I will see what I come up with on the window function.

If anyone else has some suggestions let me know.

I get with for EXPLAIN ANALYZE the id specified.

Nested Loop  (cost=0.43..107950.50 rows=8825 width=84) (actual time=2803.932..2804.558 rows=11 loops=1)
   Join Filter: (tstzrange(a1.startdate, a1.enddate) && tstzrange(a2.startdate, a2.enddate))
   Rows Removed by Join Filter: 1767741
   ->  Index Scan using t_access_pkey on t_access a1  (cost=0.43..8.45 rows=1 width=24) (actual time=0.016..0.019 rows=1 loops=1)
         Index Cond: (id = 1928761)
   ->  Seq Scan on t_access a2  (cost=0.00..77056.22 rows=1764905 width=60) (actual time=0.006..1200.657 rows=1767752 loops=1)
         Filter: (enddate IS NOT NULL)
         Rows Removed by Filter: 159270
Total runtime: 2804.599 ms


and for EXPLAIN without the id specified.  EXPLAIN ANALYZE will not complete without the id specified.

Nested Loop  (cost=0.00..87949681448.20 rows=17005053815 width=84)
   Join Filter: (tstzrange(a1.startdate, a1.enddate) && tstzrange(a2.startdate, a2.enddate))
   ->  Seq Scan on t_access a2  (cost=0.00..77056.22 rows=1764905 width=60)
         Filter: (enddate IS NOT NULL)
   ->  Materialize  (cost=0.00..97983.33 rows=1927022 width=24)
         ->  Seq Scan on t_access a1  (cost=0.00..77056.22 rows=1927022 width=24)


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux