Hi
so 2. 1. 2021 v 20:07 odesílatel Joel Jacobson <joel@xxxxxxxxxxxx> napsal:
I found yet another trick, which actually seems to be slightly faster than the plpgsql version.
What version of Postgres do you use?
plpgsql in Postgres 13 is significantly faster than on older versions
Regards
Pavel
The trick is to use VIEW ... WITH (security_barrier) to tell the optimizer it shouldn’t flatten the subqueries.CREATE TABLE eastern (year integer);INSERT INTO eastern (year) SELECT generate_series(1,100000);CREATE VIEW v0 WITH (security_barrier) AS (SELECTyear,year % 19 AS g,year / 100 AS cFROM eastern);CREATE VIEW v1 WITH (security_barrier) AS (SELECTyear,g,c,(c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 AS hFROM v0);CREATE VIEW v2 WITH (security_barrier) AS (SELECTyear,c,h - (h/28)*(1 - (h/28)*(29/(h + 1))*((21 - g)/11)) AS iFROM v1);CREATE VIEW v3 WITH (security_barrier) AS (SELECTyear,i,(year + year/4 + i + 2 - c + c/4) % 7 AS jFROM v2);CREATE VIEW v4 WITH (security_barrier) AS (SELECTyear,i - j AS pFROM v3);CREATE VIEW v5 WITH (security_barrier) AS (SELECTyear,3 + (p + 26)/30 AS easter_month,1 + (p + 27 + (p + 6)/40) % 31 AS easter_dayFROM v4);EXPLAIN ANALYZE VERBOSESELECT make_date(year, easter_month, easter_day) FROM v5;QUERY PLAN-----------------------------------------------------------------------------------------------------------------------------------------------------------Subquery Scan on v5 (cost=0.00..17943.00 rows=100000 width=4) (actual time=0.025..213.996 rows=100000 loops=1)Output: make_date(v5.year, v5.easter_month, v5.easter_day)-> Subquery Scan on v4 (cost=0.00..16693.00 rows=100000 width=12) (actual time=0.024..191.448 rows=100000 loops=1)Output: v4.year, (3 + ((v4.p + 26) / 30)), (1 + (((v4.p + 27) + ((v4.p + 6) / 40)) % 31))-> Subquery Scan on v3 (cost=0.00..13443.00 rows=100000 width=8) (actual time=0.022..155.690 rows=100000 loops=1)Output: v3.year, (v3.i - v3.j)-> Subquery Scan on v2 (cost=0.00..12193.00 rows=100000 width=12) (actual time=0.021..135.485 rows=100000 loops=1)Output: v2.year, v2.i, ((((((v2.year + (v2.year / 4)) + v2.i) + 2) - v2.c) + (v2.c / 4)) % 7)-> Subquery Scan on v1 (cost=0.00..9193.00 rows=100000 width=12) (actual time=0.019..97.935 rows=100000 loops=1)Output: v1.year, v1.c, (v1.h - ((v1.h / 28) * (1 - (((v1.h / 28) * (29 / (v1.h + 1))) * ((21 - v1.g) / 11)))))-> Subquery Scan on v0 (cost=0.00..5443.00 rows=100000 width=16) (actual time=0.017..57.988 rows=100000 loops=1)Output: v0.year, v0.g, v0.c, (((((v0.c - (v0.c / 4)) - (((8 * v0.c) + 13) / 25)) + (19 * v0.g)) + 15) % 30)-> Seq Scan on public.eastern (cost=0.00..1943.00 rows=100000 width=12) (actual time=0.015..23.908 rows=100000 loops=1)Output: eastern.year, (eastern.year % 19), (eastern.year / 100)Planning Time: 0.274 msExecution Time: 220.698 ms(16 rows)EXPLAIN ANALYZE VERBOSESELECT easter_plpgsql(year) FROM eastern;QUERY PLAN------------------------------------------------------------------------------------------------------------------------Seq Scan on public.eastern (cost=0.00..26443.00 rows=100000 width=4) (actual time=0.077..301.519 rows=100000 loops=1)Output: easter_plpgsql(year)Planning Time: 0.049 msExecution Time: 309.119 ms(4 rows)On Tue, Dec 22, 2020, at 17:32, Tom Lane wrote:"Joel Jacobson" <joel@xxxxxxxxxxxx> writes:> I think I was a bit unclear about my problem, and might have used the wrong terminology.> In my LATERAL query, there are calculations in a certain order.> For each step, "columns" are computed named e.g. "g", "c", "h", "i", etc.> However, when looking at the query plan, these steps are gone, and instead there is just one huge fully expanded _expression_, which doesn't look very efficient.Yeah, this isn't really about function inlining, it's about subqueryflattening (which is similar in some ways, but not the same thing).Unfortunately, subquery flattening happens early enough in the plannerthat there's no chance of making any useful cost comparisons to decidewhether to do it or not. So we just do it unconditionally. I'mnot really sure that failing to do it would provide a better outcomein this situation anyway --- sure, you'd save a few scalar calculations,but the overhead of running additional plan nodes could outweigh that.The long and the short of it is that SQL isn't terribly well suited toexecute a fundamentally stepwise, imperative algorithm like this one.Rather than hacking up cute tricks with LATERAL, you should just usea language that *is* well suited. That's why we provide PLs.FWIW, another trick for inserting optimization fences is WITH.So you could do something likeWITH Q1(g,c) AS MATERIALIZED(SELECT year % 19, year / 100),Q2(h) AS MATERIALIZED(SELECT (c - c/4 - (8*c + 13)/25 + 19*g + 15) % 30 FROM Q1),...SELECT make_date(year, easter_month, easter_day) FROM Q6;But I'd bet lunch that that won't be faster for this example,because there's a lot of overhead in CTEs.regards, tom laneKind regards,Joel