Re: update from performance question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Albe


Thank you for your reply

The query changed a bit


explain (analyze, buffers)
UPDATE
    csischema.tf_transaction_item_person TRANS
SET 
    general_ledger_code = PURCH.general_ledger_code,
    general_ledger_code_desc = PURCH.general_ledger_code_desc,
    update_datetime = now()::timestamp(0)
FROM
    csischema.tf_purchases_person PURCH
WHERE
    PURCH.general_ledger_code IS NOT NULL AND
    TRANS.purchased_log_id = PURCH.purchased_log_id AND
    TRANS.general_ledger_code IS NULL
;

                                                             ^
select count(*) from csischema.tf_transaction_item_person where general_ledger_code is null;
  count  
---------
 1393515

select count(*) from csischema.tf_transaction_item_person ;
  count  
---------
 3408380

select count(*) from csischema.tf_purchases_person;
  count   
----------
 20760731

select count(*) from csischema.tf_purchases_person where general_ledger_code IS NOT NULL;
  count  
---------
 6909204


But the kicker is this

A select count to see how many records will be used for update gets me zero


select count(trans.purchased_log_id) from 
    csischema.tf_transaction_item_person TRANS,
    csischema.tf_purchases_person PURCH
WHERE
    PURCH.general_ledger_code IS NOT NULL AND
    TRANS.purchased_log_id = PURCH.purchased_log_id AND
    TRANS.general_ledger_code IS NULL
;
 count 
-------
     0
(1 row)

Considering this , I wonder if an index on csischema.tf_purchases_person (purchased_log_id, general_ledger_code) and one on tf_transaction_item_person (purchased_log_id, general_ledger_code) would not help ?

This is what bugs me. 

I got the explain out


without indexes


                                                                          QUERY PLAN                                                                          
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on tf_transaction_item_person trans  (cost=1164684.43..1572235.51 rows=507748 width=227) (actual time=230320.060..230320.060 rows=0 loops=1)
   Buffers: shared hit=120188 read=876478, temp read=93661 written=93631
   ->  Hash Join  (cost=1164684.43..1572235.51 rows=507748 width=227) (actual time=230320.054..230320.054 rows=0 loops=1)
         Hash Cond: ((trans.purchased_log_id)::text = (purch.purchased_log_id)::text)
         Buffers: shared hit=120188 read=876478, temp read=93661 written=93631
         ->  Seq Scan on tf_transaction_item_person trans  (cost=0.00..228945.93 rows=1542683 width=199) (actual time=13.312..52046.689 rows=1393515 loops=1)
               Filter: (general_ledger_code IS NULL)
               Rows Removed by Filter: 2014865
               Buffers: shared read=191731
         ->  Hash  (cost=1012542.32..1012542.32 rows=6833049 width=52) (actual time=152339.000..152339.000 rows=6909204 loops=1)
               Buckets: 524288  Batches: 16  Memory Usage: 39882kB
               Buffers: shared hit=120188 read=684747, temp written=57588
               ->  Seq Scan on tf_purchases_person purch  (cost=0.00..1012542.32 rows=6833049 width=52) (actual time=8.252..140992.716 rows=6909204 loops=1)
                     Filter: (general_ledger_code IS NOT NULL)
                     Rows Removed by Filter: 13851527
                     Buffers: shared hit=120188 read=684747
 Planning time: 0.867 ms
 Execution time: 230328.223 ms
(18 rows)



with indexes



                                                                          QUERY PLAN                                                                          
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on tf_transaction_item_person trans  (cost=1161742.22..1567806.87 rows=497927 width=228) (actual time=155171.388..155171.388 rows=0 loops=1)
   Buffers: shared hit=88095 read=908571, temp read=93661 written=93631
   ->  Hash Join  (cost=1161742.22..1567806.87 rows=497927 width=228) (actual time=155171.358..155171.358 rows=0 loops=1)
         Hash Cond: ((trans.purchased_log_id)::text = (purch.purchased_log_id)::text)
         Buffers: shared hit=88095 read=908571, temp read=93661 written=93631
         ->  Seq Scan on tf_transaction_item_person trans  (cost=0.00..228945.93 rows=1542683 width=199) (actual time=16.801..31016.221 rows=1393515 loops=1)
               Filter: (general_ledger_code IS NULL)
               Rows Removed by Filter: 2014865
               Buffers: shared read=191731
         ->  Hash  (cost=1012542.32..1012542.32 rows=6700872 width=53) (actual time=105101.946..105101.946 rows=6909204 loops=1)
               Buckets: 524288  Batches: 16  Memory Usage: 39882kB
               Buffers: shared hit=88095 read=716840, temp written=57588
               ->  Seq Scan on tf_purchases_person purch  (cost=0.00..1012542.32 rows=6700872 width=53) (actual time=13.823..95970.776 rows=6909204 loops=1)
                     Filter: (general_ledger_code IS NOT NULL)
                     Rows Removed by Filter: 13851527
                     Buffers: shared hit=88095 read=716840
 Planning time: 90.409 ms
 Execution time: 155179.181 ms
(18 rows)

Thanks
Armand



On Apr 19, 2017, at 3:06 AM, Albe Laurenz <laurenz.albe@xxxxxxxxxx> wrote:

> Armand Pirvu wrote:
>> Running 9.5.2
>> 
>> I have the following update and run into a bit of a trouble . I realize the tables
>> involved have quite some data but here goes
>> 
>> 
>> UPDATE
>>    tf_transaction_item_person TRANS
>> SET
>>    general_ledger_code = PURCH.general_ledger_code,
>>    general_ledger_code_desc = PURCH.general_ledger_code_desc,
>>    update_datetime = now()::timestamp(0)
>> FROM
>>   tf_purchases_person PURCH
>> WHERE
>>    PURCH.general_ledger_code != '' AND
>>    TRANS.purchased_log_id = PURCH.purchased_log_id AND
>>    TRANS.general_ledger_code != PURCH.general_ledger_code
>> ;
> [...]
>>                              Table "tf_transaction_item_person"
> [...]
>> Indexes:
>>    "tf_transaction_item_person_pkey" PRIMARY KEY, btree (person_transaction_item_id)
>>    "tf_tip_idx" btree (client_id, update_datetime)
>>    "tf_tip_isdel_idx" btree (show_id, person_transaction_item_id)
> 
> You don't show EXPLAIN (ANALYZE, BUFFERS) output for the problematic query,
> so it is difficult to say where the time is spent.
> 
> But since you say that the same query without the UPDATE also takes more than
> a minute, the duration for the UPDATE is not outrageous.
> It may well be that much of the time is spent updating the index
> entries for the 3.5 million affected rows.
> 
> I don't know if dropping indexes for the duration of the query and recreating
> them afterwards would be a net win, but you should consider it.
> 
> It may be that the only ways to improve performance would be general
> things like faster I/O, higher max_wal_size setting, and, most of all,
> enough RAM in the machine to contain the whole database.
> 
> Yours,
> Laurenz Albe



-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux