Re: LIMIT OFFSET with DB view vs plain SQL

Raj Gandhi <raj01gandhi@xxxxxxxxx> · Mon, 1 Apr 2019 22:06:11 -0400

Thanks Rui. The performance of using function is close to the plain SQL.

Why Query planner is choosing different path with DB view?

explain analyze select foo(101,0);
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 ProjectSet  (cost=0.00..5.27 rows=1000 width=32) (actual time=10.340..10.374 rows=101 loops=1)
   ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=1)
 Planning time: 0.035 ms
 Execution time: 10.436 ms
(4 rows)

On Mon, Apr 1, 2019 at 4:14 PM Rui DeSousa <rui@xxxxxxxxxxxxx> wrote:
Try using a function that returns the result set.
i.e. 

create or replace function foo(_limit int, _offset int)
  returns setof sample_table
as $$
begin
  return query
    select * 
    from sample_table 
    order by created_date
    limit _limit 
    offset _offset
  ;
end;
$$ language plpgsql
  volatile
;  

Given your query; return a table instead of a set. i.e.: 

returns table (
   id int
  , parent_id int
  .
  .
  . 
  , response_body text
)
as $$

Query example: 

select * from foo(100, 50);

On Apr 1, 2019, at 9:56 AM, Raj Gandhi <raj01gandhi@xxxxxxxxx> wrote:

Any other idea how to resolve the performance issue with the database view?

On Fri, Mar 29, 2019 at 7:38 PM Raj Gandhi <raj01gandhi@xxxxxxxxx> wrote:
Merlin,  I tried the hack you suggested but that didn't work. Planner used the same path.

The same query works much faster when using the raw SQL instead of  DB view:

Here is the definition of DB View ‘job’
 SELECT w.id,
    w.parent_id,
    w.status AS state,
    w.percent_complete
AS progress_percentage,
    w.start_time,
    w.end_time,

w.est_completion_time AS estimated_completion_time,
    w.root_id,
    w.internal AS is_internal,
    w.order_id AS
step_order,
    c.resource_type,
    c.resource_id,
    c.id AS
command_id,
    c.client_cookie,
    c.user_name AS
"user",
    c.metadata,
    c.client_address,
    response_body(r.*,
w.*) AS response_body
   FROM work_unit w
     LEFT JOIN command
c ON c.work_unit_id = w.id
     LEFT JOIN
command_response r ON r.command_id::text = c.id::text;

Query that uses the DB view:
SELECT id, start_time
FROM job
order by id  LIMIT 101
OFFSET 0;

Explain plan: https://explain.depesz.com/s/gzjQ

 Query using the raw SQL
<SQL from Job DB View definition>
ORDER BY id LIMIT 101 OFFSET 0;

Explain plan:https://explain.depesz.com/s/KgwO

On Fri, Mar 29, 2019 at 11:26 AM Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
On Thu, Mar 28, 2019 at 5:44 PM Raj Gandhi <raj01gandhi@xxxxxxxxx> wrote:

>

> + pgsql-performance

>

> On Thu, Mar 28, 2019 at 6:41 PM Raj Gandhi <raj01gandhi@xxxxxxxxx> wrote:

>>

>> Hi everyone,

>>

>>

>>

>> I’m using LIMIT offset with DB view. Looks like query planner is applying the LIMIT for DB view at the end after processing all rows.

>>

>> When running same SQL that was used to create the DB view, LIMIT is applied earlier so the query is much faster.

>>

>>

>>

>> Explain plan using DB view

>>

>> https://explain.depesz.com/s/gzjQ

>>

>>

>>

>> Explain plan using raw SQL

>>

>> https://explain.depesz.com/s/KgwO

>>

>>

>>

>> In both tests LIMIT was 100 with offset  = 0.

>>

>> Is there any way to force DB view to apply limit earlier?

huh. OFFSET does indeed force a materialize plan.   This is a widely

used tactic to hack the planner ('OFFSET 0').

Maybe try converting your query from something like:

SELECT * FROM foo LIMIT m OFFSET N;

to

WITH data AS

(

  SELECT * FROM foo LIMIT m + n

)

SELECT * FROM foo OFFSET n;

I didn't try this, and it may not help, but it's worth a shot.

merlin