Search Postgresql Archives

Re: Index over only uncommon values in table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jun 18, 2013, at 2:29 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:

> On Tue, Jun 18, 2013 at 12:17 PM, Steven Schlansker <steven@xxxxxxxxxxxx> wrote:
> Hi everyone,
> 
> I assume this is not easy with standard PG but I wanted to double check.
> 
> I have a column that has a very uneven distribution of values.  ~95% of the values will be the same, with some long tail of another few dozens of values.
> 
> I want to have an index over this value.  Queries that select the most common value will not use the index, because it is a overwhelming percentage of the table.  This means that ~95% of the disk space and IOPS to maintain the index is "wasted".
> 
> I cannot use a hardcoded partial index because:
> 1) The common value is not known at schema definition time, and may change (very slowly) over time.
> 
> 
> I think this is going to turn into a game of whack-a-mole.  There is going to have to be some transition period during which all or most of the rows need to be indexed.  So that amount of space is going to have to be available, and given that it is available, what advantage is there to using it some of the time and not using it some of the time? You can't feasibly use it for something else during the off periods, because then you will run into emergency out of space situations.

This is a good point.

I could define it further to e.g. only take values which make up < 10% of the table, for example, which may solve this problem.

But from the responses I've been getting it sounds like there's nothing standard that will "just solve my problem" and it's not worth the effort to me to cook up something intelligent right now.

>  
>  
> 2) JDBC uses prepared statements for everything, and the value to be selected is not known at statement prepare time, so any partial indices are ignored (this is a really really obnoxious behavior and makes partial indices almost useless combined with prepared statements, sadly…)
> 
> 
> What version are you using?  This has been improved in 9.2.0.

Thank goodness!  We are in the process of migrating to the 9.2 branch now, so I am thrilled to get this.  I'm sorry I didn't mention the version earlier, funny how the simplest things escape your mind when you're trying to make a good mailing list post...

Thanks again for all the input.



-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux