Search Postgresql Archives

Re: how to limit statement memory allocation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On Tue, Mar 9, 2021 at 6:53 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Radoslav Nedyalkov <rnedyalkov@xxxxxxxxx> writes:
> On Tue, Mar 9, 2021 at 6:03 PM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> Perhaps the accumulation is happening on the client side?  libpq doesn't
>> have any provision for spilling a result set to disk.

> Ah, I named it result set wrongly perhaps.
> These are queries , part of a larger ETL function or statement which at the
> end just write to a table.
> The huge join is an intermediate step.

Hm.  What's the query plan look like?

The only thing I'm aware of that could consume unbounded memory
on the server side is hash aggregation.  (v13 has improved that
situation, but I'm guessing you are running some older version.)
The planner won't choose hash aggregation if it estimates that
the hash table would need to exceed work_mem, but its estimate
could be wrong.  If that's the scenario, you could back off
work_mem until hash aggregation isn't used, or you could try to
improve the planner's estimates.  If your ETL process involves
intermediate tables, you might need to explicitly ANALYZE those
after you fill them so that the planner has a better idea how
to plan the later steps.

                        regards, tom lanea
 
Thanks Tom,
Query and plan attached.
 
Rado
db=# EXPLAIN 
db-#         CREATE TEMP TABLE gross_set_merchants AS (
db(#                 WITH gs_merchants as (
db(#                         SELECT
db(#                                 u.merchant_id,
db(#                                 uel.user_id
db(#                         FROM public.user_event_logs as uel
db(#                         JOIN users as u ON uel.user_id = u.id
db(#                         where 
db(#                                 --u.merchant_id = 1030616
db(#                                 uel.created_at >= '2018-01-01'
db(#                                 AND uel.category = 'settings'
db(#                                 AND uel.description IN ('Gross settlement enabled','Gross settlement disabled')
db(#                         GROUP BY 1,2
db(#                 ), setting_change as (
db(#                         SELECT
db(#                                 gsm.merchant_id,
db(#                                 cer.date,
db(#                                 SUM(CASE 
db(#                                         WHEN uel.description = 'Gross settlement enabled' THEN 1
db(#                                         WHEN uel.description = 'Gross settlement disabled' THEN -1
db(#                                         ELSE 0
db(#                                 END) change_setting
db(#                         FROM external.currency_exchange_rates as cer
db(#                         CROSS JOIN gs_merchants as gsm --full join, having all days for all gs merchants
db(#                         LEFT JOIN public.user_event_logs as uel ON uel.created_at::date = cer.date 
db(#                                 AND gsm.user_id = uel.user_id 
db(#                                 AND uel.created_at >= '2018-01-01'
db(#                                 AND uel.category = 'settings'
db(#                                 AND uel.description IN ('Gross settlement enabled','Gross settlement disabled')
db(#                         WHERE cer.currency = 'GBP'
db(#                                 AND cer.date >= '2018-01-01'
db(#                         GROUP BY 1,2
db(#                                 
db(#                 )
db(#                 
db(#                 SELECT
db(#                         sc.merchant_id,
db(#                         sc.date,
db(#                         SUM(change_setting) OVER (PARTITION BY sc.merchant_id ORDER BY date) as gs_on
db(#                 FROM setting_change as sc); 
                                                                                                                             QUERY PLAN                                                                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=183885316.97..183890372.97 rows=252800 width=40)
   CTE gs_merchants
     ->  Group  (cost=2475689.43..2534384.63 rows=567218 width=8)
           Group Key: u.merchant_id, uel.user_id
           ->  Gather Merge  (cost=2475689.43..2532021.22 rows=472682 width=8)
                 Workers Planned: 2
                 ->  Group  (cost=2474689.41..2476461.97 rows=236341 width=8)
                       Group Key: u.merchant_id, uel.user_id
                       ->  Sort  (cost=2474689.41..2475280.26 rows=236341 width=8)
                             Sort Key: u.merchant_id, uel.user_id
                             ->  Parallel Hash Join  (cost=265272.50..2453595.37 rows=236341 width=8)
                                   Hash Cond: (uel.user_id = u.id)
                                   ->  Parallel Seq Scan on user_event_logs uel  (cost=0.00..2187702.48 rows=236341 width=4)
                                         Filter: ((created_at >= '2018-01-01 00:00:00'::timestamp without time zone) AND ((description)::text = ANY ('{"Gross settlement enabled","Gross settlement disabled"}'::text[])) AND ((category)::text = 'settings'::text))
                                   ->  Parallel Hash  (cost=209818.33..209818.33 rows=4436333 width=8)
                                         ->  Parallel Seq Scan on users u  (cost=0.00..209818.33 rows=4436333 width=8)
   CTE setting_change
     ->  HashAggregate  (cost=181320662.52..181323190.52 rows=252800 width=16)
           Group Key: gsm.merchant_id, cer.date
           ->  Merge Left Join  (cost=161250580.17..170174828.82 rows=891666696 width=71)
                 Merge Cond: ((gsm.user_id = uel_1.user_id) AND (cer.date = ((uel_1.created_at)::date)))
                 ->  Sort  (cost=158950948.16..161180114.90 rows=891666696 width=12)
                       Sort Key: gsm.user_id, cer.date
                       ->  Nested Loop  (cost=0.29..11157645.37 rows=891666696 width=12)
                             ->  CTE Scan on gs_merchants gsm  (cost=0.00..11344.36 rows=567218 width=8)
                             ->  Materialize  (cost=0.29..471.24 rows=1572 width=4)
                                   ->  Index Only Scan using currency_exchange_rates_pk on currency_exchange_rates cer  (cost=0.29..463.38 rows=1572 width=4)
                                         Index Cond: ((date >= '2018-01-01'::date) AND (currency = 'GBP'::text))
                 ->  Sort  (cost=2299632.01..2301050.06 rows=567218 width=75)
                       Sort Key: uel_1.user_id, ((uel_1.created_at)::date)
                       ->  Gather  (cost=1000.00..2245424.28 rows=567218 width=75)
                             Workers Planned: 2
                             ->  Parallel Seq Scan on user_event_logs uel_1  (cost=0.00..2187702.48 rows=236341 width=75)
                                   Filter: ((created_at >= '2018-01-01 00:00:00'::timestamp without time zone) AND ((description)::text = ANY ('{"Gross settlement enabled","Gross settlement disabled"}'::text[])) AND ((category)::text = 'settings'::text))
   ->  Sort  (cost=27741.81..28373.81 rows=252800 width=16)
         Sort Key: sc.merchant_id, sc.date
         ->  CTE Scan on setting_change sc  (cost=0.00..5056.00 rows=252800 width=16)
(37 rows)

Time: 15.080 ms
db=# 


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux