Maybe also useful to know: cipafilter=# select reltuples from pg_class where relname = 'log_raw'; reltuples ------------- 5.40531e+08 (1 row) On Wed, Feb 15, 2017 at 7:55 PM, David Hinkle <hinkle@xxxxxxxxxxxxxx> wrote: > Thanks for your help! > > Karsten: The system does fill up swap before it blows up. This > particular model has 8G of ram and 4G of swap and runs kernel > 4.4.0-53-generic #74~14.04.1-Ubuntu. > > Tom, there are three columns in this table that exhibit the problem, > here is the statistics data after an analyze, and the real data to > compare it to. > > cipafilter=# select attname, n_distinct, most_common_freqs from > pg_stats where tablename = 'log_raw' and (attname = 'urlid' or attname > = 'titleid' or attname = 'hostid'); > attname | n_distinct | > > > > > > > most_common_freqs > ---------+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > urlid | 38963 | > {0.0188,0.009,0.00853333,0.0079,0.00686667,0.0068,0.00616667,0.0061,0.00606667,0.005,0.0049,0.00483333,0.00463333,0.00456667,0.0044,0.0044,0.0039,0.0032,0.00263333,0.00263333,0.00256667,0.00256667,0.00246667,0.0023,0.0023,0.00223333,0.00203333,0.002,0.002,0.002,0.0019,0.00186667,0.00183333,0.0018,0.0018,0.00176667,0.00176667,0.00176667,0.00176667,0.00173333,0.00173333,0.00173333,0.0017,0.0017,0.00166667,0.0016,0.00156667,0.00153333,0.0015,0.00146667,0.00143333,0.0014,0.0014,0.0014,0.0014,0.0014,0.0014,0.00136667,0.00133333,0.0013,0.0013,0.0013,0.0013,0.00126667,0.00126667,0.00123333,0.00123333,0.00123333,0.00123333,0.0012,0.0012,0.00113333,0.0011,0.00106667,0.00106667,0.001,0.001,0.001,0.001,0.000966667,0.000966667,0.000966667,0.000966667,0.000966667,0.000933333,0.000933333,0.000933333,0.0009,0.0009,0.0009,0.0009,0.0009,0.0009,0.0009,0.0009,0.0009,0.000866667,0.000866667,0.000866667,0.000866667} > hostid | 2478 | > {0.2109,0.0330667,0.0321667,0.0245,0.0223,0.0177667,0.0165,0.0152,0.0151333,0.014,0.0132,0.0115667,0.00993333,0.00963333,0.00873333,0.00853333,0.00853333,0.00746667,0.0074,0.00653333,0.0065,0.0065,0.00646667,0.00513333,0.00506667,0.00503333,0.00496667,0.00493333,0.0049,0.00486667,0.00483333,0.00453333,0.0044,0.00433333,0.00426667,0.00413333,0.00396667,0.00386667,0.00383333,0.00363333,0.00363333,0.00363333,0.00346667,0.00323333,0.00323333,0.0032,0.00316667,0.00303333,0.00303333,0.0029,0.00286667,0.00273333,0.00273333,0.00266667,0.00263333,0.00263333,0.0026,0.0026,0.0025,0.00243333,0.00243333,0.0024,0.00236667,0.00226667,0.00223333,0.00216667,0.00213333,0.0021,0.0021,0.00206667,0.00206667,0.00203333,0.002,0.002,0.00193333,0.0019,0.0019,0.00186667,0.00186667,0.00186667,0.0018,0.0018,0.0018,0.00176667,0.00173333,0.0017,0.0017,0.00166667,0.00166667,0.00166667,0.00163333,0.00163333,0.00153333,0.0015,0.0015,0.0015,0.00146667,0.00146667,0.00146667,0.00146667} > titleid | 292 | {0.767167} > (3 rows) > > I have to patch the pg_stats table to get postgres to run the > following queries without crashing: > > cipafilter=# UPDATE pg_statistic AS s > cipafilter-# SET stadistinct = (select reltuples from pg_class > where relname = 'titles') > cipafilter-# FROM pg_class c, pg_attribute a where c.oid = > s.starelid and c.oid = a.attrelid and a.attnum = s.staattnum and > cipafilter-# relname = 'log_raw' and attname = 'titleid'; > UPDATE 1 > cipafilter=# UPDATE pg_statistic AS s > cipafilter-# SET stadistinct = (select reltuples from pg_class > where relname = 'urls') > cipafilter-# FROM pg_class c, pg_attribute a where c.oid = > s.starelid and c.oid = a.attrelid and a.attnum = s.staattnum and > cipafilter-# relname = 'log_raw' and attname = 'urlid'; > UPDATE 1 > cipafilter=# UPDATE pg_statistic AS s > cipafilter-# SET stadistinct = (select reltuples from pg_class > where relname = 'hosts') > cipafilter-# FROM pg_class c, pg_attribute a where c.oid = > s.starelid and c.oid = a.attrelid and a.attnum = s.staattnum and > cipafilter-# relname = 'log_raw' and attname = 'hostid'; > UPDATE 1 > > cipafilter=# select attname, n_distinct from pg_stats where tablename > = 'log_raw' and (attname = 'urlid' or attname = 'titleid' or attname = > 'hostid'); > attname | n_distinct > ---------+------------- > urlid | 1.51452e+08 > hostid | 303756 > titleid | 879485 > > cipafilter=# select titleid, count(titleid) from log_raw group by > titleid order by count(titleid) desc limit 10; > titleid | count > ---------+----------- > 1 | 423848049 > 49547 | 403432 > 238 | 188151 > 12 | 151640 > 5 | 149524 > 6196 | 139445 > 32014 | 123828 > 200 | 88682 > 58921 | 86451 > 10 | 84264 > (10 rows) > > cipafilter=# select urlid, count(urlid) from log_raw group by urlid > order by count(urlid) desc limit 10; > urlid | count > --------+---------- > 129991 | 10843088 > 1 | 4953757 > 21 | 4345503 > 2765 | 4266981 > 12 | 3654127 > 920 | 3609054 > 1135 | 3562185 > 20 | 3495023 > 283567 | 3019675 > 2494 | 2655301 > (10 rows) > > cipafilter=# select hostid, count(hostid) from log_raw group by hostid > order by count(hostid) desc limit 10; > hostid | count > --------+----------- > 7 | 117342686 > 5 | 18016481 > 53 | 17408992 > 57 | 12947564 > 543 | 12698269 > 1 | 10068246 > 127544 | 8746204 > 27 | 8618595 > 40 | 8507278 > 36 | 7424412 > (10 rows) > > Fun fact, hostid 7 is 'google.com'. > > On Wed, Feb 15, 2017 at 6:11 AM, Karsten Hilbert > <Karsten.Hilbert@xxxxxxx> wrote: >> On Wed, Feb 15, 2017 at 01:04:51PM +0100, Karsten Hilbert wrote: >> >>> > Nope, that pops too. The query runs for a long time at a somewhat >>> > normal rate of ram consumption, using ~1G of RSS then suddenly spikes >>> > to about 6G, at which point the OOM killer pops it. Box has 8G of ram >>> > and 4G of swap. >>> >>> By any chance: >>> >>> - when it happens has the kernel considered using swap ? >>> >>> - which kernel are you running ? >>> >>> There's been (for some workloads) massive problems with RAM >>> exhaustion / swapping / OOM killer going wild with >>> 4.7/4.8/some 4.9 kernels. >> >> I guess what I'm trying to say is that it may actually not be >> PostgreSQL's fault but rather the kernel invoking the OOM >> killer way prematurely. >> >> Karsten >> -- >> GPG key ID E4071346 @ eu.pool.sks-keyservers.net >> E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346 >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > > > -- > David Hinkle > > Senior Software Developer > > Phone: 800.243.3729x3000 > > Email: hinkle@xxxxxxxxxxxxxx > > Hours: Mon-Fri 8:00AM-5:00PM (CT) -- David Hinkle Senior Software Developer Phone: 800.243.3729x3000 Email: hinkle@xxxxxxxxxxxxxx Hours: Mon-Fri 8:00AM-5:00PM (CT) -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general