Re: Regrading brin_index on required column of the table

Durgamahesh Manne <maheshpostgres9@xxxxxxxxx> · Thu, 20 Sep 2018 17:18:41 +0530

On Thu, Sep 20, 2018 at 3:12 PM Durgamahesh Manne <maheshpostgres9@xxxxxxxxx> wrote:

Hi 

As per your suggestion 

i ran explain analyse for distinct query 

the size of the table1 is 30mb
the size of the table2 is 368kb

 EXPLAIN ANALYZE select  distinct sub_head."vchSubmittersCode" ,rec."vchFileName" , rec."vchCusipFundIDSubFundID" , rec."vchFundUnitPrice"  
,  sub_head."vchValuationDate" ,rec."vchAdvisorLabel" from  table1 rec join  table2 sub_head on rec."vchS
ubmittersCode"=sub_head."vchSubmittersCode" where  rec."bFetch"=false and sub_head."bFetch"=false ;

  Unique  (cost=5871873.64..6580630.73 rows=7400074 width=89) (actual time=326397.551..389515.863 rows=3700000 loops=1)                                                        |
|   ->  Sort  (cost=5871873.64..5973124.65 rows=40500405 width=89) (actual time=326397.550..372470.846 rows=40500000 loops=1)                                                  |
|         Sort Key: sub_head."vchSubmittersCode", rec."vchFileName", rec."vchCusipFundIDSubFundID", rec."vchFundUnitPrice", sub_head."vchValuationDate", rec."vchAdvisorLabel" |
|         Sort Method: external merge  Disk: 3923224kB                                                                                                                         |
|         ->  Hash Join  (cost=21.06..457723.28 rows=40500405 width=89) (actual time=0.339..6939.296 rows=40500000 loops=1)                                                    |
|               Hash Cond: ((rec."vchSubmittersCode")::text = (sub_head."vchSubmittersCode")::text)                                                                            |
|               ->  Seq Scan on table1 rec  (cost=0.00..1822.66 rows=100001 width=80) (actual time=0.011..56.998 rows=100000 loops=1)                    |
|                     Filter: (NOT "bFetch")                                                                                                                                   |
|                     Rows Removed by Filter: 4706                                                                                                                             |
|               ->  Hash  (cost=16.00..16.00 rows=405 width=11) (actual time=0.319..0.319 rows=405 loops=1)                                                                    |
|                     Buckets: 1024  Batches: 1  Memory Usage: 26kB                                                                                                            |
|                     ->  Seq Scan on table2 sub_head  (cost=0.00..16.00 rows=405 width=11) (actual time=0.005..0.248 rows=405 loops=1)             |
|                           Filter: (NOT "bFetch")                                                                                                                             |
|                           Rows Removed by Filter: 375                                                                                                                        |
| Planning time: 0.237 ms                                                                                                                                                      |
| Execution time: 390252.089 ms                                                             

so i am unable to reduce the query execution time as it is taken around 7 minutes to execute with indexes & without indexes 

 please help in reducing the query execution time 

Regards
Durgamahesh Manne

On Wed, Sep 19, 2018 at 9:07 PM Alban Hertroys <haramrae@xxxxxxxxx> wrote:
The results of explain analyze would shed light on the core problem.

My guess is that your conditions are not very selective - ie. most

records in both tables have bFetch = false - and therefore you are

retrieving most of your data and that is what's taking 7 minutes. No

index is going to fix that.

If those boolean values are distributed very unevenly (say 99.9% has

false and 0.1% has true), you may get better results by excluding the

records with 'true' values (instead of including those that are

'false'), for example by using a where not exists(...) subquery.

Obviously, that still won't help if you're just fetching a lot of data.

On Wed, 19 Sep 2018 at 16:23, Durgamahesh Manne

<maheshpostgres9@xxxxxxxxx> wrote:

>

>

>

>

>

>

> On Wed, Sep 19, 2018 at 7:41 PM Igor Neyman <ineyman@xxxxxxxxxxxxxx> wrote:

>>

>>

>>

>> From: Durgamahesh Manne [mailto:maheshpostgres9@xxxxxxxxx]

>> Sent: Wednesday, September 19, 2018 10:04 AM

>> To: Igor Neyman <ineyman@xxxxxxxxxxxxxx>

>> Subject: Re: Regrading brin_index on required column of the table

>>

>> On Wed, Sep 19, 2018 at 7:19 PM Igor Neyman <ineyman@xxxxxxxxxxxxxx> wrote:

>>

>>

>>

>> From: Durgamahesh Manne [mailto:maheshpostgres9@xxxxxxxxx]

>> Sent: Wednesday, September 19, 2018 9:43 AM

>> To: PostgreSQL mailing lists <pgsql-general@xxxxxxxxxxxxxx>

>> Subject: Regrading brin_index on required column of the table

>>

>> Hi

>>

>> Respected postgres community members

>>

>>

>>

>> I have created BRIN index on few columns of the table without any issues. But i am unable to create BRIN index on one column of the table as i got error listed below

>>

>>

>>

>>

>>

>> [local]:6263 postgres@huawei=# CREATE INDEX brin_idx on huawei using brin ("dFetch");

>>

>> ERROR:  data type boolean has no default operator class for access method "brin"

>>

>> HINT:  You must specify an operator class for the index or define a default operator class for the data type.

>>

>>

>>

>>  below is the column description:

>>

>> Column     datatype   collation    nullable       default    storage

>>

>>

>>

>>  dFetch        boolean                                         false        plain

>>

>>

>>

>>

>>

>>

>>

>> so please help in creating of the BRIN index on above column of the table .

>>

>>

>>

>>

>>

>>

>>

>> Regards

>>

>>

>>

>> Durgamahesh Manne

>>

>>

>>

>> Why would you want BRIN index on Boolean-type column?

>>

>> What kind of interval will you specify?

>>

>>

>>

>> Regards,

>>

>> Igor Neyman

>>

>>

>>

>>

>>

>>

>>

>>

>>

>>  Hi

>>

>>

>>

>>

>>

>>  I have complex query like for ex  select  distinct sub_head."vchSubmittersCode" ,rec."vchFileName" , rec."vchCusipFundIDSubFundID" , rec."vchFundUnitPrice"  ,  sub_head."vchValuationDate" ,rec."vchAdvisorLabel" from  "table1" rec join  "table2" sub_head on rec."vchSubmittersCode"=sub_head."vchSubmittersCode" where  rec."bFetch"=false and sub_head."bFetch"=false ;

>>

>>

>>

>>

>>

>> Query taken around 7 minutes time to execute without indexes on required columns

>>

>>

>>

>>  SO  i need to execute this distinct query at less time by creating indexes on required columns of the tables

>>

>>

>>

>> i have created brin indexes on vchsubmitterscode of two tables

>>

>>

>>

>> i am not able to create brin indexes on bfetch tables as i got a error  ERROR:  data type boolean has no default operator class for access method "brin"

>>

>> HINT:  You must specify an operator class for the index or define a default operator class for the data type.

>>

>>

>>

>>

>>

>> so please help in creating of the BRIN index on above column of the table as i need to reduce the query execution time

>>

>>

>>

>>

>>

>> Regards

>>

>>

>>

>> Durgamahesh Manne

>>

>> Again, BRIN indexes are not design to work on Boolean columns. If you want to index Boolean column, just create regular BTREE index.

>>

>> Regards,

>>

>> Igor

>>

>>

>>

>>

>

>

>  Hi

>

> I have already tried with BTREE indexes & HASH indexes on required columns .but distinct query execution time was not reduced

>

>

> Query taken around 7 minutes time to execute with BTREE indexes & HASH indexes on required columns

>

>

> Regards

>

> Durgamahesh Manne

>

-- 

Hi

Query was executed at less time without distinct 

As well as query was taking around 7 minutes to complete execution with distinct 

 select   distinct  sub_head."vchSubmittersCode" ,rec."vchFileName" , rec."vchCusipFundIDSubFundID" , rec."vchFundUnitPrice"  ,  sub_head."vchValuationDate" ,rec."vchAdvisorLabel" from  table1 rec join table2 sub_head on rec."vchSubmittersCode"=sub_head."vchSubmittersCode" where  rec."bFetch"=false and sub_head."bFetch"=false ;

I need to execute above distinct query at less time as distinct query was taking more time to execute  even i have created indexes on required columns of the tables

Regards
Durgamahesh Manne