How do query optimizers affect window functions

Tianyin Xu <tixu@xxxxxxxxxxx> · Wed, 14 Nov 2012 00:12:42 -0800

Hi, Postgresql,

I want to understand how the query optimizers affect the output of the window functions.

For example, set "cpu_tuple_cost = 50" in postgresql.conf and start the server, I apply the regress test (make installcheck). The test of window function fails. 

Checking the diff and I found the output of the window functions are different. For example, 

For the following query:

SELECT sum(unique1) over (rows between current row and unbounded following), unique1, four FROM tenk1 WHERE unique1 < 10;

The expected results are:

 sum | unique1 | four
-----+---------+------
  45 |       4 |    0
  41 |       2 |    2
  39 |       1 |    1
  38 |       6 |    2
  32 |       9 |    1
  23 |       8 |    0

  15 |       5 |    1
  10 |       3 |    3
   7 |       7 |    3
   0 |       0 |    0

But the real results are:

 sum | unique1 | four 
-----+---------+------
  45 |       0 |    0    
  45 |       1 |    1    

  44 |       2 |    2    
  42 |       3 |    3    
  39 |       4 |    0    
  35 |       5 |    1    
  30 |       6 |    2    
  24 |       7 |    3    
  17 |       8 |    0    
   9 |       9 |    1    

There're altogether 6 queries in window test that outputs different query results.

I don't understand why the results are different. Intuitively, the queries show return the same results no matter what plan the optimizer choose. 

I suspected the previous queries had some side effect on the latter one (e.g., change the current row), so I removed all the previous queries before this query in window.sql. But the result did not change.

Could anyone explain this behavior? Or point out how to investigate?

Thanks a lot!
Tianyin