Hi Maxim,
Thanks for your help.
Jeff has pointed me in the right direction here, it seems that the rows=1 is due to a long running transaction which deletes all of common_student for school_id 36 and then repopulates it.
I was unaware that the delete inside the transaction would affect the VACUUM ANALYSE. As I said to Jeff I will have to rewrite the application to update the rows if this is the normal behaviour.
Here are the queries:
db=> explain analyze select * from common_student where school_id = 36;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on common_student (cost=88.30..3846.49 rows=1533 width=384) (actual time=4.852..7.065 rows=1388 loops=1)
Recheck Cond: (school_id = 36)
Heap Blocks: exact=67
-> Bitmap Index Scan on idx_common_student_sid (cost=0.00..87.91 rows=1533 width=0) (actual time=4.817..4.817 rows=1388 loops=1)
Index Cond: (school_id = 36)
Planning time: 0.097 ms
Execution time: 8.084 ms
(7 rows)
db=> /* At this point I have started a long running transaction that deletes all of common_student for school_id 36 */ ;
db=> analyse verbose common_student(school_id);
INFO: analyzing "public.common_student"
INFO: "common_student": scanned 7322 of 7322 pages, containing 65431 live rows and 8060 dead rows; 56818 rows in sample, 65431 estimated total rows
ANALYZE
db=> explain analyze select * from common_student where school_id = 36;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_common_student_sid on common_student (cost=0.41..8.43 rows=1 width=384) (actual time=0.017..1.498 rows=1388 loops=1)
Index Cond: (school_id = 36)
Planning time: 0.098 ms
Execution time: 2.583 ms
(4 rows)
db=> create table test_table AS SELECT * from common_student;
SELECT 65431
db=> vacuum analyze test_table;
VACUUM
db=> explain analyze select * from test_table where school_id = 36;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on test_table (cost=0.00..3848.89 rows=1381 width=384) (actual time=10.105..22.680 rows=1388 loops=1)
Filter: (school_id = 36)
Rows Removed by Filter: 64043
Planning time: 0.390 ms
Execution time: 23.767 ms
(5 rows)
db=> drop table test_table;
DROP TABLE
db=> create table test_table AS SELECT * from common_student ORDER BY school_id;
SELECT 65431
db=> vacuum analyze test_table;
VACUUM
db=> explain analyze select * from test_table where school_id = 36;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on test_table (cost=0.00..3850.89 rows=1341 width=382) (actual time=5.674..27.585 rows=1388 loops=1)
Filter: (school_id = 36)
Rows Removed by Filter: 64043
Planning time: 0.264 ms
Execution time: 28.643 ms
(5 rows)
db=> explain analyze select * from test_table where school_id = 36;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on test_table (cost=0.00..3850.89 rows=1341 width=382) (actual time=20.848..43.272 rows=1388 loops=1)
Filter: (school_id = 36)
Rows Removed by Filter: 64043
Planning time: 0.068 ms
Execution time: 44.423 ms
(5 rows)
db=> /* At this point I have killed the long running transaction that deletes all of common_student for school_id 36 */ ;
db=> vacuum analyze common_student;
VACUUM
db=> explain analyze select * from common_student where school_id = 36;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on common_student (cost=79.17..3357.79 rows=1388 width=383) (actual time=0.088..1.302 rows=1388 loops=1)
Recheck Cond: (school_id = 36)
Heap Blocks: exact=67
-> Bitmap Index Scan on idx_common_student_sid (cost=0.00..78.83 rows=1388 width=0) (actual time=0.077..0.077 rows=1388 loops=1)
Index Cond: (school_id = 36)
Planning time: 0.327 ms
Execution time: 2.311 ms
(7 rows)
On Sun, 23 Dec 2018 at 15:28 Maxim Boguk <maxim.boguk@xxxxxxxxx> wrote:
Hi Mark,It's look very weird.Can you try something like this (check that you have enough disk space for second copy of common_student before):create table test_table AS SELECT * from common_student;Vacuum analyze test_table;explain analyze select * from test_table where school_id = 36;drop table test_table;create table test_table AS SELECT * from common_student ORDER BY school_id;Vacuum analyze test_table;explain analyze select * from test_table where school_id = 36;drop table test_table;And provide results of both explain analyze queries.On Sat, Dec 22, 2018 at 3:39 AM Mark <mwchambers@xxxxxxxxx> wrote:Hi All,I'm seeing some very slow queries and it looks like the query planner is deciding to do a 'Nested Loop Left Join'which is slow. When I SET enable_nestloop=OFF for testing it does a 'Hash Left Join' which is much faster.I think I can see the cause of the problem in the examples below. I can't make sense of the statistics being writtenby ANALYSE or the planners row estimates.Can anyone help me understand....- Why the row estimate I get in the query below for school_id = 36 is 1 ? ( I expect it to be higher)- Why does '1' appear in the most_common_vals when it is actually the least common value.- Why doesn't 36 appear in the most_common_vals (it is more common than 1)- Does the analyse output below mean that it only scanned 51538 of 65463 rows in the table? Is school_id 36 just being missed in the sample? (This happens when the analyse is repeated )Any help with understanding what's happening here would be much appreciated.I hope I've provided enough information below.Thanks,Markdb=> explain analyse select * from common_student where school_id = 36 ;QUERY PLAN--------------------------------------------------------------------------------------------------------------------------------------------Index Scan using idx_common_student_sid on common_student (cost=0.41..8.39 rows=1 width=385) (actual time=0.264..1.691 rows=1388 loops=1)Index Cond: (school_id = 36)Planning time: 0.087 msExecution time: 2.706 ms(4 rows)db=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct from pg_stats where attname='school_id' and tablename='common_student';tablename | attname | most_common_vals | histogram_bounds | n_distinct----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------common_student | school_id | {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} | | 45(1 row)db=> select count(distinct(school_id)) from common_student ;count-------55(1 row)db=> alter table common_student alter column school_id set statistics 100000;WARNING: lowering statistics target to 10000ALTER TABLEdb=> analyse verbose common_student(school_id);INFO: analyzing "public.common_student"INFO: "common_student": scanned 7322 of 7322 pages, containing 65463 live rows and 49026 dead rows; 51538 rows in sample, 65463 estimated total rowsANALYZEdb=> select tablename,attname,most_common_vals,histogram_bounds,n_distinct from pg_stats where attname='school_id' and tablename='common_student';tablename | attname | most_common_vals | histogram_bounds | n_distinct----------------+-----------+----------------------------------------------------------------------------------------------------------------------------------------+------------------+------------common_student | school_id | {79,60,25,61,59,69,86,77,64,33,56,78,58,81,41,97,22,67,38,23,3,72,92,93,48,24,96,26,75,90,70,52,51,21,14,91,83,54,85,11,68,94,53,88,1} | | 45(1 row)db=> explain analyse select * from common_student where school_id = 36 ;QUERY PLAN--------------------------------------------------------------------------------------------------------------------------------------------Index Scan using idx_common_student_sid on common_student (cost=0.41..8.39 rows=1 width=385) (actual time=0.542..4.022 rows=1388 loops=1)Index Cond: (school_id = 36)Planning time: 0.334 msExecution time: 6.542 ms(4 rows)db=> select school_id, count(*) from common_student group by school_id order by count(*) limit 6 ;school_id | count-----------+-------1 | 5088 | 16153 | 25294 | 42231 | 43468 | 454(6 rows)dvpjxbzc=> select school_id, count(*) from common_student where school_id = 36 group by school_id ;school_id | count-----------+-------36 | 1388(1 row)db=> explain analyse select * from common_student where school_id = 1 ;QUERY PLAN----------------------------------------------------------------------------------------------------------------------------------Bitmap Heap Scan on common_student (cost=4.91..243.76 rows=64 width=385) (actual time=0.128..0.267 rows=50 loops=1)Recheck Cond: (school_id = 1)Heap Blocks: exact=16-> Bitmap Index Scan on "unique common_student" (cost=0.00..4.90 rows=64 width=0) (actual time=0.110..0.110 rows=50 loops=1)Index Cond: (school_id = 1)Planning time: 0.177 msExecution time: 0.414 ms(7 rows)db=> select VERSION();version-----------------------------------------------------------------------------PostgreSQL 10.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2, 64-bit(1 row)
--Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/
Phone RU: +7 985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"