Re: Bad planning data resulting in OOM killing of postgres

David Hinkle <hinkle@xxxxxxxxxxxxxx> · Thu, 16 Feb 2017 17:27:50 -0600

I guess this doesn't work, latest test run crashed.  It still uses the
bad plan for the hostid column even after n_distinct is updated.

cipafilter=# select attname, n_distinct from pg_stats where tablename
cipafilter-# = 'log_raw' and (attname = 'urlid' or attname = 'titleid'
or attname =
cipafilter(# 'hostid');
 attname | n_distinct
---------+-------------
 urlid   | 1.51625e+08
 hostid  |      304710
 titleid |      886499
(3 rows)

cipafilter=# explain DELETE FROM hosts WHERE NOT EXISTS ( SELECT
log_raw.hostid FROM log_raw WHERE log_raw.hostid = hosts.hostid );
                                       QUERY PLAN
-----------------------------------------------------------------------------------------
 Delete on hosts  (cost=22249475.67..74844813.47 rows=1 width=12)
   ->  Hash Anti Join  (cost=22249475.67..74844813.47 rows=1 width=12)
         Hash Cond: (hosts.hostid = log_raw.hostid)
         ->  Seq Scan on hosts  (cost=0.00..5017.10 rows=304710 width=10)
         ->  Hash  (cost=12799395.52..12799395.52 rows=543645052 width=10)
               ->  Seq Scan on log_raw  (cost=0.00..12799395.52
rows=543645052 width=10)
(6 rows)

I guess I will also try throwing in 'set enable_hashjoin = false;' and
see if that gets these purges to go.

On Thu, Feb 16, 2017 at 2:22 PM, David Hinkle <hinkle@xxxxxxxxxxxxxx> wrote:
> Yep, 420ish million records out of 540 million records have a titleid
> of 1.   There are about 880,000 other unique values, but most of the
> records are 1.   Of course, n_distinct is only 292.   I'm surprised
> it's not eliminating the duplicates while it builds that hash table.
>
> This is what I'm doing for a work around right now.  Getting
> n_distinct right seems to be preventing the system from breaking.
> It's going to be executed once a week during the weekly maintenance.
> It's setting the n_distinct of each column to the number of rows in
> the associated table.
>
>     CREATE OR REPLACE FUNCTION patch_ndistinct(_table varchar, _column
> varchar, _string_table varchar)
>       RETURNS real AS
>     $$
>          DECLARE _cnt REAL;
>     BEGIN
>          SELECT reltuples INTO _cnt from pg_class where relname = _string_table;
>        EXECUTE 'ALTER TABLE ' || _table || ' ALTER COLUMN ' || _column
> || ' SET (n_distinct=' || _cnt || ')';
>        RETURN _cnt;
>     END
>     $$ LANGUAGE plpgsql;
>     select patch_ndistinct('log_raw', 'titleid', 'titles');
>     select patch_ndistinct('log_raw', 'urlid', 'urls');
>     select patch_ndistinct('log_raw', 'hostid', 'hosts');
>     ANALYZE log_raw;
>
> On Thu, Feb 16, 2017 at 10:54 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> David Hinkle <hinkle@xxxxxxxxxxxxxx> writes:
>>> Tom, there are three columns in this table that exhibit the problem,
>>> here is the statistics data after an analyze, and the real data to
>>> compare it to.
>>
>>>  attname | n_distinct |  most_common_freqs
>>
>>>  titleid |        292 | {0.767167}
>>
>> Ouch.  That's saying there's some single value of titleid that accounts
>> for more than three-quarters of the entries ... does that square with
>> reality?  That'd certainly explain why a hash join goes nuts.
>>
>>                         regards, tom lane
>
>
>
> --
> David Hinkle
>
> Senior Software Developer
>
> Phone:  800.243.3729x3000
>
> Email:  hinkle@xxxxxxxxxxxxxx
>
> Hours:  Mon-Fri   8:00AM-5:00PM (CT)

-- 
David Hinkle

Senior Software Developer

Phone:  800.243.3729x3000

Email:  hinkle@xxxxxxxxxxxxxx

Hours:  Mon-Fri   8:00AM-5:00PM (CT)

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general