Hi Michael,
I added the index as you suggested, and the planner is now going through a bitmap index scan and bitmap heap scan instead.
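For reference, the index I created looks roughly like this (the index name is just what I picked; the columns are the ones you suggested):

create index receiving_item_eventtype_replenishmenttype_idx
    on receiving_item_delivered_received (eventtype, replenishmenttype);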
Mem config:
Aurora PostgreSQL 11.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.9.3, 64-bit
vCPU = 64
RAM = 512 GB
shared_buffers = 355 GB
work_mem = 214 MB
maintenance_work_mem = 8363 MB
effective_cache_size = 355 GB
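Those values are from individual SHOW commands; the same settings can also be read in one query from pg_settings if that is easier:

select name, setting, unit
from pg_settings
where name in ('shared_buffers', 'work_mem', 'maintenance_work_mem', 'effective_cache_size');
-- "setting" is reported in the units shown in the "unit" column (e.g. 8kB blocks for shared_buffers)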
Thanks,
Rj
On Friday, September 4, 2020, 02:55:50 PM PDT, Michael Lewis <mlewis@xxxxxxxxxxx> wrote:
"Subquery Scan on rec (cost=1628601.89..1676580.92 rows=7381 width=41) (actual time=22171.986..23549.079 rows=1236042 loops=1)"
" Filter: (rec.mpos = 1)"
" Rows Removed by Filter: 228737"
" Buffers: shared hit=45 read=1166951"
" I/O Timings: read=29.530"
" -> WindowAgg (cost=1628601.89..1658127.45 rows=1476278 width=49) (actual time=22171.983..23379.219 rows=1464779 loops=1)"
" Buffers: shared hit=45 read=1166951"
" I/O Timings: read=29.530"
" -> Sort (cost=1628601.89..1632292.58 rows=1476278 width=41) (actual time=22171.963..22484.044 rows=1464779 loops=1)"
" Sort Key: receiving_item_delivered_received.serial_no, receiving_item_delivered_received.eventtime DESC"
" Sort Method: quicksort Memory: 163589kB"
" Buffers: shared hit=45 read=1166951"
" I/O Timings: read=29.530"
" -> Gather (cost=1000.00..1477331.13 rows=1476278 width=41) (actual time=1.296..10428.060 rows=1464779 loops=1)"
" Workers Planned: 2"
" Workers Launched: 2"
" Buffers: shared hit=39 read=1166951"
" I/O Timings: read=29.530"
" -> Parallel Seq Scan on receiving_item_delivered_received (cost=0.00..1328703.33 rows=615116 width=41) (actual time=1.262..10150.325 rows=488260 loops=3)"
" Filter: (((COALESCE(serial_no, ''::character varying))::text <> ''::text) AND ((eventtype)::text = 'LineItemdetailsReceived'::text) AND ((replenishmenttype)::text = 'DC2SWARRANTY'::text))"
" Rows Removed by Filter: 6906258"
" Buffers: shared hit=39 read=1166951"
" I/O Timings: read=29.530"
"Planning Time: 0.375 ms"
"Execution Time: 23617.348 ms"
That is doing a lot of reading from disk. What do you have shared_buffers set to? I'd expect better cache hits unless it is quite low or this is a query that differs greatly from the typical work.
Also, did you try adding the index I suggested? That lowest node has 488k rows coming out of it after throwing away 6.9 million. I would expect an index on only eventtype, replenishmenttype to be quite helpful. I don't assume you have tons of rows where serial_no is null.
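If you want to double-check that, a quick count along these lines would tell you (table and column names are taken from your plan):

select count(*) filter (where serial_no is null) as null_serials,
       count(*) filter (where coalesce(serial_no, '') = '') as null_or_empty_serials,
       count(*) as total_rows
from receiving_item_delivered_received;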