Re: Better index stategy for many fields with few values

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Jim,

Jim Nasby wrote:
> Adding -performance back in
>     I would like to try it.
> 
>     However in an other post I added that contrary to what I stated
>     initially all the paramXX columns are not mandatory in the query. So
>     it seems that requirement make the problem more complexe.

Okay, this rules out my functional index over 19 columns.

>     Doesn't this new requirement rule out this solution? 
> 
> No, just group the columns logically.

Yes, that's the solution.

If you have common groups of columns that appear and disappear
synchroneously, pack those together in an (possibly partitioned and/or
functional) index.

Then rely on the query planner that the combines the appropriate indices
via index bitmap scan.

>      By the way I have test to index each column individually and check
>     what happens in relation to bitscan map. My test table  is 1
>     million  rows. The explain analyze command shows that a bit scan is
>     sometimes used but I still end up with queries that can take up to
>     10s which is way to much.

Is it on the first query, or on repeated queries?

It might be that you're I/O bound, and the backend has to fetch indices
and rows from Disk into RAM.

I currently don't know whether the order of indices in a multi-index
bitmap scan is relevant, but I could imagine that it may be useful to
have the most selective index scanned first.

And keep in mind that, assuming an equal distribution of your
parameters, every index bitmap hits 1/10th of the whole table on
average, so the selectivity generally is low.

The selectivity of a partitioned 3-column index will be much better
(about 1/10000th of the whole table), and less index scans and bitmaps
have to be generated.

A functional index may also make sense to CLUSTER the table to optimize
the locality of search results (and so reducing disk I/O). In case your
table has low write activity, but high read-only activity, the overhead
that comes with the additional index is neglible compared to the
performance improvement proper CLUSTERing can generate.

Markus
-- 
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux