Search Postgresql Archives

Re: Query not using index pgsql 8.2.3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Never mind my last post regarding the Bitmap heap scan. That is all in order.

Now it works perfectly. Went from 30+ seconds to 179ms. =)

I'm just amazed that setting the statistics threshold on one column mad all the difference. IS there any guidelines on what columns I should change the statistics on?


Thanks again!!!

23 mar 2007 kl. 14:04 skrev Henrik Zagerholm:


23 mar 2007 kl. 13:34 skrev Michael Fuhr:

On Fri, Mar 23, 2007 at 12:41:58PM +0100, Henrik Zagerholm wrote:
23 mar 2007 kl. 12:33 skrev Michael Fuhr:
The row count estimate for fk_filetype_id = 83 is high by an order
of magnitude:

Bitmap Index Scan on tbl_file_idx6  (cost=0.00..25.65 rows=1251
width=0) (actual time=21.958..21.958 rows=112 loops=1)

Index Cond: (fk_filetype_id = 83)

Have you run ANALYZE or VACUUM ANALYZE on these tables recently?
If so then you might try increasing the statistics target for
tbl_file.fk_filetype_id and perhaps some of the columns in the join
conditions.

I did a vacuum full and reindex on all tables.

VACUUM FULL is seldom (if ever) necessary if you're running plain
VACUUM (without FULL) often enough, either manually or via autovacuum.

Now I also did a vacuum analyze on tbl_acl (the biggest table with
about 4.5 millin rows)

Same result.

I'd suggest analyzing all tables.  The bad estimate I mentioned
appears to be for a column in tbl_file so if you didn't analyze
that table then the query plan probably won't improve.

But do you mean if the row_count estimate is big it can't use any
index on any other table within the JOINs?

High row count estimates make the planner think that scanning entire
tables would be faster than using indexes.  The more of a table a
query must fetch the less efficient an index scan becomes, to the
point that a sequential scan is faster than an index scan.

Any specific parameters I should adjust?

If analyzing the tables doesn't improve the row count estimates
then try increasing some columns' statistics targets and re-analyze
the table or just that column.  Example:

ALTER TABLE tbl_file ALTER fk_filetype_id SET STATISTICS 100;
ANALYZE tbl_file (fk_filetype_id);

I analyzed all tables and found no difference but after changing the statistics for fk_filetype_id I got way better results. Still there is a big cost on table tbl_file_structure_id it is making a Bitmap heap scan.
Could I do something to improve this?

Thanks for all your help so far!

Limit (cost=115168.22..115169.12 rows=20 width=173) (actual time=5138.401..5138.982 rows=20 loops=1) -> GroupAggregate (cost=115168.22..115341.60 rows=3853 width=173) (actual time=5138.394..5138.929 rows=20 loops=1) -> Sort (cost=115168.22..115177.85 rows=3853 width=173) (actual time=5138.339..5138.430 rows=73 loops=1) Sort Key: tbl_file.file_name, tbl_file.file_ctime, tbl_structure.structure_path, tbl_computer.pk_computer_id, tbl_filetype.filetype_icon, tbl_computer.computer_name, tbl_share.share_name, tbl_share.share_path, tbl_file_structure.fk_file_id -> Nested Loop (cost=8.48..114938.74 rows=3853 width=173) (actual time=185.588..5118.684 rows=730 loops=1) -> Seq Scan on tbl_filetype (cost=0.00..1.25 rows=1 width=18) (actual time=0.036..0.053 rows=1 loops=1)
                          Filter: (83 = pk_filetype_id)
-> Hash Join (cost=8.48..114898.96 rows=3853 width=171) (actual time=185.538..5116.063 rows=730 loops=1) Hash Cond: (tbl_archive.fk_share_id = tbl_share.pk_share_id) -> Nested Loop (cost=6.27..114843.77 rows=3853 width=146) (actual time=167.608..5093.255 rows=730 loops=1) -> Nested Loop (cost=6.27..32079.13 rows=1167 width=146) (actual time=130.594..3448.927 rows=330 loops=1) Join Filter: (tbl_archive.pk_archive_id = tbl_structure.fk_archive_id) -> Nested Loop (cost=4.89..31185.00 rows=1167 width=138) (actual time=119.861..3402.527 rows=330 loops=1) -> Nested Loop (cost=4.89..24413.49 rows=1167 width=56) (actual time=82.203..2097.142 rows=330 loops=1) -> Index Scan using tbl_file_idx6 on tbl_file (cost=0.00..344.56 rows=114 width=40) (actual time=13.164..15.475 rows=106 loops=1) Index Cond: (fk_filetype_id = 83) -> Bitmap Heap Scan on tbl_file_structure (cost=4.89..210.27 rows=69 width=24) (actual time=8.736..19.606 rows=3 loops=106) Recheck Cond: (tbl_file.pk_file_id = tbl_file_structure.fk_file_id) -> Bitmap Index Scan on tbl_file_structure_idx (cost=0.00..4.88 rows=69 width=0) (actual time=2.574..2.574 rows=3 loops=106) Index Cond: (tbl_file.pk_file_id = tbl_file_structure.fk_file_id) -> Index Scan using tbl_structure_pkey on tbl_structure (cost=0.00..5.79 rows=1 width=98) (actual time=3.942..3.945 rows=1 loops=330) Index Cond: (tbl_structure.pk_structure_id = tbl_file_structure.fk_structure_id) -> Materialize (cost=1.37..1.71 rows=34 width=16) (actual time=0.034..0.077 rows=34 loops=330) -> Seq Scan on tbl_archive (cost=0.00..1.34 rows=34 width=16) (actual time=10.606..10.693 rows=34 loops=1) Filter: archive_complete -> Index Scan using tbl_acl_idx on tbl_acl (cost=0.00..70.47 rows=36 width=8) (actual time=4.964..4.971 rows=2 loops=330) Index Cond: (tbl_acl.fk_file_structure_id = tbl_file_structure.pk_file_structure_id) -> Hash (cost=2.16..2.16 rows=4 width=41) (actual time=17.890..17.890 rows=4 loops=1) -> Hash Join (cost=1.07..2.16 rows=4 width=41) (actual time=17.848..17.872 rows=4 loops=1) Hash Cond: (tbl_share.fk_computer_id = tbl_computer.pk_computer_id) -> Seq Scan on tbl_share (cost=0.00..1.04 rows=4 width=29) (actual time=0.093..0.099 rows=4 loops=1) -> Hash (cost=1.03..1.03 rows=3 width=20) (actual time=17.724..17.724 rows=3 loops=1) -> Seq Scan on tbl_computer (cost=0.00..1.03 rows=3 width=20) (actual time=17.697..17.706 rows=3 loops=1)
Total runtime: 5139.608 ms



The default statistics target is 10; the maximum value is 1000.

On some systems I've found that reducing random_page_cost from 4
(the default) to 2 gives more realistic cost estimates for index
scans.

--
Michael Fuhr


---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux