Search Postgresql Archives

Re: Queries on very big table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 2, 2017 at 5:23 AM, Job <Job@xxxxxxxxxxxxxxxxxxxx> wrote:
> Hello guys and very good new year to everybody!
>
> We are now approaching some queries and statistics on very big table (about 180 millions of record).
> The table is partitioned by day (about ~3 Gb of data for every partition/day).
> We use Postgresql 9.6.1
>
> I am experiencing quite important slowdown on queries.
> I manually made a "vacuum full" and a "reindex" on every partition in order to clean free space and reorder records.
>
> I have a BRIN index on timestamp and index on other field (btree)
>
> Starting by a simple query: explain analyze select count(domain) from webtraffic_archive:
>
>                                                                                    QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=3220451.94..3220451.95 rows=1 width=8) (actual time=36912.624..36912.624 rows=1 loops=1)
>    ->  Gather  (cost=3220451.52..3220451.93 rows=4 width=8) (actual time=36911.600..36912.614 rows=5 loops=1)
>          Workers Planned: 4
>          Workers Launched: 4
>          ->  Partial Aggregate  (cost=3219451.52..3219451.53 rows=1 width=8) (actual time=36906.804..36906.804 rows=1 loops=5)
>                ->  Append  (cost=0.00..3094635.41 rows=49926443 width=0) (actual time=4.716..31331.229 rows=39853988 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive  (cost=0.00..0.00 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_01_01  (cost=0.00..10.47 rows=47 width=0) (actual time=0.000..0.000 rows=0 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_16  (cost=0.00..213728.26 rows=3498026 width=0) (actual time=4.713..2703.458 rows=2798421 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_17  (cost=0.00..201379.39 rows=3247739 width=0) (actual time=6.334..2364.726 rows=2598191 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_18  (cost=0.00..176248.86 rows=2824986 width=0) (actual time=7.437..2014.812 rows=2259989 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_19  (cost=0.00..177493.33 rows=2866433 width=0) (actual time=9.951..2145.958 rows=2293146 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_20  (cost=0.00..120271.83 rows=1960883 width=0) (actual time=0.011..372.092 rows=1568706 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_21  (cost=0.00..276391.94 rows=4485294 width=0) (actual time=5.386..3111.589 rows=3588235 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_22  (cost=0.00..287611.68 rows=4630668 width=0) (actual time=6.598..3335.834 rows=3704535 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_23  (cost=0.00..249047.61 rows=4014361 width=0) (actual time=7.206..2628.884 rows=3211489 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_24  (cost=0.00..192008.70 rows=3097370 width=0) (actual time=9.870..1882.826 rows=2477896 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_25  (cost=0.00..87385.16 rows=1405616 width=0) (actual time=0.018..427.248 rows=1124493 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_26  (cost=0.00..88262.80 rows=1436080 width=0) (actual time=0.014..277.327 rows=1148864 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_27  (cost=0.00..222607.43 rows=3557243 width=0) (actual time=8.497..1232.210 rows=2845795 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_28  (cost=0.00..210414.76 rows=3365676 width=0) (actual time=0.033..548.878 rows=2692541 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_29  (cost=0.00..185065.72 rows=2955872 width=0) (actual time=0.031..498.079 rows=2364697 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_30  (cost=0.00..149139.55 rows=2382656 width=0) (actual time=0.011..501.351 rows=1906124 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2016_12_31  (cost=0.00..166991.89 rows=2664288 width=0) (actual time=0.041..437.631 rows=2131431 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2017_01_01  (cost=0.00..79197.29 rows=1260930 width=0) (actual time=0.018..254.124 rows=1008744 loops=5)
>                      ->  Parallel Seq Scan on webtraffic_archive_day_2017_01_02  (cost=0.00..11378.74 rows=272274 width=0) (actual time=0.017..34.352 rows=130691 loops=5)
>  Planning time: 313.907 ms
>  Execution time: 36941.700 ms
>
> Other more complex queries are slower.
>
> How can i improve it?
> Records number can raise up until 1.000 millions.
> Do i need a third-part tool for big data?

What storage do you have?

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux