Hello everyone, > Apparently, the planner isn't reusing the data boundaries across alternative > plans. It would be nicer if the planner remembered each column boundaries > for later reuse (within the same planner execution). I've written a very naive (and crappy) patch to show how adding memorization to get_actual_variable_range() could help the planner on scenarios with a big number of joins. For the previous example, > explain (analyze, buffers) > select * from a > join b b1 on (b1.a = a.a) > join b b2 on (b2.a = a.a) > where b1.a in (1,100,10000,1000000,1000001); each time you add a join clause the planner has to read an extra ~5[K] buffers and gets about 200[ms] slower. 1 join Planning: Buffers: shared hit=9 read=27329 Planning Time: 101.745 ms Execution Time: 0.082 ms 2 joins Planning: Buffers: shared hit=42 read=81988 Planning Time: 303.237 ms Execution Time: 0.102 ms 3 joins Planning: Buffers: shared hit=94 read=136660 Planning Time: 508.947 ms Execution Time: 0.155 ms 4 joins Planning: Buffers: shared hit=188 read=191322 Planning Time: 710.981 ms Execution Time: 0.168 ms After adding memorization the cost in buffers remains constant and the latency deteriorates only marginally (as expected) with each join. 1 join Planning: Buffers: shared hit=10 read=27328 Planning Time: 97.889 ms Execution Time: 0.066 ms 2 joins Planning: Buffers: shared hit=7 read=27331 Planning Time: 100.589 ms Execution Time: 0.111 ms 3 joins Planning: Buffers: shared hit=9 read=27329 Planning Time: 105.669 ms Execution Time: 0.134 ms 4 joins Planning: Buffers: shared hit=132 read=27370 Planning Time: 155.716 ms Execution Time: 0.219 ms I'd be happy to improve this patch into something better. Though I'd like suggestions on how to do it: I have this idea of creating a local "memorization" struct instance within standard_planner(). That would require passing on a pointer down until it reaches get_actual_variable_range(), which I think would be quite ugly, if done just to improve the planner for this scenario. Is there any better mechanism I could reuse from other modules? (utils or cache, for example). Regards, Manuel
Attachment:
actual_variable_range_memorization.diff
Description: Binary data