Re: TB-sized databases

Simon Riggs wrote:
> On Tue, 2007-11-27 at 18:06 -0500, Pablo Alcaraz wrote:
>> Simon Riggs wrote:
>>> All of those responses have cooked up quite a few topics into one. Large
>>> databases might mean text warehouses, XML message stores, relational
>>> archives and fact-based business data warehouses.
>>>
>>> The main thing is that TB-sized databases are performance critical. So
>>> it all depends upon your workload really as to how well PostgreSQL, or
>>> any other RDBMS, can handle them.
>>>
>>> Anyway, my reason for replying to this thread is that I'm planning
>>> changes for PostgreSQL 8.4+ that will allow us to get bigger and
>>> faster databases. If anybody has specific concerns then I'd like to hear
>>> them so I can consider those things in the planning stages.
>>
>> It would be nice to do something with SELECTs so we can retrieve a
>> rowset from huge tables using indexed criteria without falling back
>> to a full scan.
>>
>> In my opinion, by definition, a huge database will sooner or later have
>> tables far bigger than the available RAM (same for their indexes). I
>> think queries need to be answered using indexes smart enough to be
>> fast on disk.
>
> OK, I agree with this one.
> I'd thought that index-only plans were only for OLTP, but now I see they
> can also make a big difference with DW queries. So I'm very interested
> in this area now.

If that's true, then you want to get behind the work Gokulakannan Somasundaram (http://archives.postgresql.org/pgsql-hackers/2007-10/msg00220.php) has done on thick indexes. I would have thought that concept particularly useful in DW. Only having to scan the indexes on a number of joined tables, without visiting the tables themselves, would be a huge win for some of these types of queries.
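
To make the "only scan the index" idea concrete, here is a minimal sketch. It is not PostgreSQL and it is not the thick-index patch linked above; it uses SQLite (via Python's standard sqlite3 module) purely as a stand-in, because SQLite reports in its query plan when a query can be answered from an index alone. The table, column and index names are made up for illustration.

```python
import sqlite3


def plan_for(index_columns):
    """Return SQLite's plan text for one fixed query, given the indexed columns.

    Hypothetical schema: a small 'sales' table standing in for a DW fact table.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL, note TEXT)"
    )
    conn.executemany(
        "INSERT INTO sales (customer_id, amount, note) VALUES (?, ?, ?)",
        [(i % 100, float(i), "x") for i in range(1000)],
    )
    # index_columns is a trusted literal here; never build SQL like this
    # from user input.
    conn.execute(f"CREATE INDEX idx_sales ON sales ({index_columns})")
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT customer_id, amount FROM sales WHERE customer_id = 42"
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable detail string.
    return " ".join(str(row[-1]) for row in rows)


if __name__ == "__main__":
    # Index on the filter column only: the index finds the rows, but the
    # table must still be visited to fetch 'amount'.
    print(plan_for("customer_id"))
    # Index that also holds 'amount': the query is answered from the index.
    print(plan_for("customer_id, amount"))
```

With the two-column index SQLite describes the plan as using a covering index, meaning the table itself is never read. The PostgreSQL discussion is harder than this sketch suggests, because the index would also need row visibility information before the heap visit could be skipped, which is what the thick-index work is about.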

My tiny point of view would say that is a much better investment than setting up the proposed parameter. I can see the use of the parameter though. Most of the complaints about indexes carrying visibility information are about update/delete contention. I would expect that in a DW those things aren't in the critical path like they are in many other applications. Especially with partitioning, and with previous partitions not getting many updates, I would think there could be great benefit. I would think that many of Pablo's requests up-thread would get a significant performance benefit from this type of index. But as I mentioned at the start, that's my tiny point of view and I certainly don't have the resources to direct what gets looked at for PostgreSQL.

Regards

Russell Smith

