On Wed, Nov 30, 2016 at 6:26 AM, Marc-Olaf Jaschke <marc-olaf.jaschke@xxxxxxx> wrote: > Hi, > > i have a performance issue with bitmap index scans on huge amounts of big jsonb documents. > > > ===== Background ===== > > - table with big jsonb documents > - gin index on these documents > - queries using index conditions with low selectivity > > > ===== Example ===== > > select version(); >> PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-17), 64-bit > > show work_mem; >> 1GB > > -- setup test data > create table bitmap_scan_test as > select > i, > (select jsonb_agg(jsonb_build_object('x', i % 2, 'filler', md5(j::text))) from generate_series(0, 100) j) big_jsonb > from > generate_series(0, 100000) i; > > create index on bitmap_scan_test using gin (big_jsonb); > > analyze bitmap_scan_test; > > > -- query with bitmap scan > explain analyze > select > count(*) > from > bitmap_scan_test > where > big_jsonb @> '[{"x": 1, "filler": "cfcd208495d565ef66e7dff9f98764da"}]'; > > Aggregate (cost=272.74..272.75 rows=1 width=8) (actual time=622.272..622.272 rows=1 loops=1) > -> Bitmap Heap Scan on bitmap_scan_test (cost=120.78..272.49 rows=100 width=0) (actual time=16.496..617.431 rows=50000 loops=1) > Recheck Cond: (big_jsonb @> '[{"x": 1, "filler": "cfcd208495d565ef66e7dff9f98764da"}]'::jsonb) > Heap Blocks: exact=637 > -> Bitmap Index Scan on bitmap_scan_test_big_jsonb_idx (cost=0.00..120.75 rows=100 width=0) (actual time=16.371..16.371 rows=50000 loops=1) > Index Cond: (big_jsonb @> '[{"x": 1, "filler": "cfcd208495d565ef66e7dff9f98764da"}]'::jsonb) > Planning time: 0.106 ms > Execution time: 622.334 ms > > > perf top -p... shows heavy usage of pglz_decompress: > > Overhead Shared Object Symbol > 51,06% postgres [.] pglz_decompress > 7,33% libc-2.12.so [.] memcpy Another thing to possibly look at is configuring the column not to compress; over half the time is spent decompressing the data. See: ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN } Naturally this is a huge tradeoff so do some careful analysis before making the change. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance