On 2018-07-30 07:19:07 -0400, Jeff Janes wrote:
> On Mon, Jul 30, 2018 at 12:11 AM, Pavel Stehule <pavel.stehule@xxxxxxxxx>
> wrote:
>
> > 2018-07-30 1:00 GMT+02:00 Tom Lane <tgl@xxxxxxxxxxxxx>:
> >
> >> David Rowley <david.rowley@xxxxxxxxxxxxxxx> writes:
> >> > On 29 July 2018 at 17:38, Dinesh Kumar <dns98944@xxxxxxxxx> wrote:
> >> >> I found performance variance between accessing int1 and int200 column
> >> >> which is quite large.
> >>
> >> > Have a look at slot_deform_tuple and heap_deform_tuple. You'll see
> >> > that tuples are deformed starting at the first attribute. If you ask
> >> > for attribute 200 then it must deform 1-199 first.
> >>
> >> Note that that can be optimized away in some cases, though evidently
> >> not the one the OP is testing. From memory, you need a tuple that
> >> contains no nulls, and all the columns to the left of the target
> >> column have to be fixed-width datatypes. Otherwise, the offset to
> >> the target column is uncertain, and we have to search for it.
> >
> > JIT decrease a overhead of this.
>
> The bottleneck here is such a simple construct, I don't see how JIT could
> improve it by much.

The deforming can become quite a bit faster with JITing, because we know
the column types and widths. If the intervening columns are NOT NULL and
fixed width, we can even optimize processing them away at runtime nearly
entirely.

> And indeed, in my hands JIT makes it almost 3 times worse.

Not in my measurements. Your example won't use JIT at all, because it's
below the cost threshold. So I think you might just be seeing cache + hint
bit effects?

> Run against ab87b8fedce3fa77ca0d6, I get 12669.619 ms for the 2nd JIT
> execution and 4594.994 ms for the JIT=off.
Even with a debug LLVM build, which greatly increases compilation
overhead, I actually see quite the benefit when I force JIT to be used:

postgres[26832][1]=# ;SET jit_above_cost = -1; set jit_optimize_above_cost = 0; set jit_inline_above_cost = 0;
postgres[26832][1]=# explain (analyze, buffers, timing off) select pk, int200 from i200c200;
┌───────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                            QUERY PLAN                                             │
├───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Seq Scan on i200c200  (cost=0.00..233332.28 rows=9999828 width=16) (actual rows=10000000 loops=1) │
│   Buffers: shared hit=133334                                                                      │
│ Planning Time: 0.069 ms                                                                           │
│ Execution Time: 3645.069 ms                                                                       │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)

postgres[26832][1]=# ;SET jit_above_cost = 0; set jit_optimize_above_cost = 0; set jit_inline_above_cost = 0;
postgres[26832][1]=# explain (analyze, buffers, timing off) select pk, int200 from i200c200;
┌───────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                            QUERY PLAN                                             │
├───────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Seq Scan on i200c200  (cost=0.00..233332.28 rows=9999828 width=16) (actual rows=10000000 loops=1) │
│   Buffers: shared hit=133334                                                                      │
│ Planning Time: 0.070 ms                                                                           │
│ JIT:                                                                                              │
│   Functions: 2                                                                                    │
│   Inlining: true                                                                                  │
│   Optimization: true                                                                              │
│ Execution Time: 3191.683 ms                                                                       │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘
(8 rows)

Now that's not *huge*, but it's not nothing either. And it's a win even
though JITing takes its own good time (we need to improve on that).
If I force all the bigint columns to be NOT NULL DEFAULT 0, the results
get more drastic:

postgres[28528][1]=# ;SET jit_above_cost = 0; set jit_optimize_above_cost = 0; set jit_inline_above_cost = 0;
postgres[28528][1]=# explain (analyze, buffers, timing off) select pk, int200 from i200c200;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                             QUERY PLAN                                              │
├─────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Seq Scan on i200c200  (cost=0.00..2600000.00 rows=10000000 width=16) (actual rows=10000000 loops=1) │
│   Buffers: shared hit=2500000                                                                       │
│ Planning Time: 0.066 ms                                                                             │
│ JIT:                                                                                                │
│   Functions: 2                                                                                      │
│   Inlining: true                                                                                    │
│   Optimization: true                                                                                │
│ Execution Time: 4837.872 ms                                                                         │
└─────────────────────────────────────────────────────────────────────────────────────────────────────┘
(8 rows)

postgres[28528][1]=# ;SET jit_above_cost = -1; set jit_optimize_above_cost = 0; set jit_inline_above_cost = 0;
postgres[28528][1]=# explain (analyze, buffers, timing off) select pk, int200 from i200c200;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                             QUERY PLAN                                              │
├─────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Seq Scan on i200c200  (cost=0.00..2600000.00 rows=10000000 width=16) (actual rows=10000000 loops=1) │
│   Buffers: shared hit=2500000                                                                       │
│ Planning Time: 0.067 ms                                                                             │
│ Execution Time: 8192.236 ms                                                                         │
└─────────────────────────────────────────────────────────────────────────────────────────────────────┘
(4 rows)

That's because the JITed version essentially now boils down to a near
optimal loop over the intervening bigint columns (which we deform because
we use a slot - at some point we're going to have to do better). No checks
for the NULL bitmap, no alignment considerations, all that's optimized
away.

Greetings,

Andres Freund